Lattice reduction in orthogonal time frequency space modulation

ABSTRACT

Methods, systems and devices for lattice reduction in decision feedback equalizers for orthogonal time frequency space (OTFS) modulation are described. An exemplary wireless communication method, implementable by a wireless communication receiver apparatus, includes receiving a signal comprising information bits modulated using an OTFS modulation scheme. Each delay-Doppler bin in the signal is modulated using a quadrature amplitude modulation (QAM) mapping. The method also includes estimating the information bits based on an inverse of a single error covariance matrix of the signal, with the single error covariance matrix being representative of an estimation error for all delay-Doppler bins in the signal.

CROSS-REFERENCE TO RELATED APPLICATIONS

This patent document claims priority to and benefits of U.S. Provisional Patent Application No. entitled “LATTICE REDUCTION IN OTFS DECISION FEEDBACK EQUALIZER” and filed on 6 Sep. 2017. The entire content of the aforementioned patent application is incorporated by reference as part of the disclosure of this patent document.

TECHNICAL FIELD

The present document relates to wireless communication, and more particularly, to signal reception schemes used in wireless communication.

BACKGROUND

Due to an explosive growth in the number of wireless user devices and the amount of wireless data that these devices can generate or consume, current wireless communication networks are fast running out of bandwidth to accommodate such a high growth in data traffic and provide high quality of service to users.

Various efforts are underway in the telecommunication industry to come up with next generation of wireless technologies that can keep up with the demand on performance of wireless devices and networks.

SUMMARY

This document discloses techniques that can be used to implement receivers that receive orthogonal time frequency space (OTFS) modulated symbols and recover information bits from these symbols based, in part, on lattice reduction techniques for OTFS modulation.

In one example aspect, a method for wireless communication implementable at a receiver device is disclosed. The method includes receiving a signal comprising information bits modulated using an orthogonal time frequency space (OTFS) modulation scheme, where each delay-Doppler bin in the signal is modulated using a quadrature amplitude modulation (QAM) mapping, and estimating the information bits based on an inverse of a single error covariance matrix of the signal, where the single error covariance matrix is representative of an estimation error for all delay-Doppler bins in the signal.

In another example aspect, a wireless communication apparatus that implements the above-described method is disclosed.

In yet another example aspect, the method may be embodied as processor-executable code and may be stored on a non-transitory computer-readable program medium.

These, and other, features are described in this document.

DESCRIPTION OF THE DRAWINGS

Drawings described herein are used to provide a further understanding and constitute a part of this application. Example embodiments and illustrations thereof are used to explain the technology rather than limiting its scope.

FIG. 1 shows an example of a wireless communication system.

FIG. 2 is a block diagram of an example implementation of Decision Feedback Equalization (DFE) in an OTFS wireless receiver apparatus.

FIG. 3 shows an example of lattice expansion.

FIG. 4 shows an example of a unipotent change of basis.

FIG. 5 shows an example of a flip change of basis.

FIG. 6 shows an example of a change of realization transform of a standard rank two real lattice.

FIGS. 7A and 7B show examples of a Babai approximation with respect to a non-reduced basis and a reduced basis, respectively.

FIG. 8 shows an example of a sphere decoding combinatorial tree.

FIG. 9 is a flowchart of an example method of wireless signal reception.

FIG. 10 is a block diagram of an example of a wireless communication apparatus.

DETAILED DESCRIPTION

To make the purposes, technical solutions and advantages of this disclosure more apparent, various embodiments are described in detail below with reference to the drawings. Unless otherwise noted, embodiments and features in embodiments of the present document may be combined with each other.

Section headings are used in the present document to improve readability of the description and do not in any way limit the discussion or the embodiments to the respective sections only.

1 Overview of Multiple Input Multiple Output (MIMO) Systems

One of the fundamental attributes of a Multiple Input Multiple Output (MIMO) system is the increase in data transmission rate it offers for a given bandwidth. This is achieved by sending several data streams in parallel. Data streams can be coded into parallel streams of QAM symbols and transmitted simultaneously from different transmit antennas that are kept appropriately separated. By receiving these signals on a group of receive antennas, the vector of QAM symbols can be detected and the transmitted bit streams can thus be recovered. The capacity increase is proportional to the smaller of a) the number of transmit antennas and b) the number of receive antennas.

At the receiver, the received QAM vector is typically jointly detected. Optimal QAM vector detection for MIMO channels is extremely computationally intensive, to the point that for reasonable antenna configurations (for example, 4×4 antennas transmitting 64-QAM), highly sub-optimal QAM detectors are currently used in practice.

In orthogonal time frequency space (OTFS) modulation, due to the delay-Doppler domain organization of transmitted symbols, all transmitted QAM vectors experience the same channel. Therefore, after equalization, all QAM vectors will have the same error covariance matrix, denoted by R_(ee). This property distinguishes OTFS from traditional orthogonal frequency division multiplexing (OFDM) based transmissions such as LTE systems, because each QAM vector in traditional OFDM schemes experiences a different channel and therefore the residual error matrix for each QAM vector would typically be different.

Embodiments of the disclosed technology exploit this property of OTFS modulation at receivers to apply a single pre-processing algorithm to R_(ee). After pre-processing the R_(ee), a computationally inexpensive QAM detection algorithm (for example, a Babai detector) can be effectively used for all transmitted QAM vectors. The pre-processing algorithm includes performing lattice reduction, as is disclosed in the present document.

FIG. 1 depicts an example of a wireless communication system 100 where the techniques described in the present document may be embodied. The system 100 may include one or more Base Stations (BS) and one or more User Equipment (UE) 102. The device 102 (marked as Receiver) may be a user device embodied in various forms such as a mobile phone, a smartphone, a tablet, a computer, and so on. The BS (marked as Transmitter) may emit downlink transmissions to the UE 102. The UE 102 may transmit uplink transmissions to the BS (not explicitly shown in the figure). These transmissions may experience distortion and multipath delays due to objects such as moving vehicles, buildings, trees and so on. Similarly, the UEs may be moving, and therefore may add Doppler distortion to the signal received at the UE and BS. The methods described in the present document may be implemented by either the UE receiver or the BS receiver when receiving transmissions from the BS transmitter or the UE transmitter.

1.1 MIMO Channel Detection Problem

Consider a MIMO channel. Assume that QAM signals transmitted on the MIMO channel have unit energy and are represented by:

y=Hx+w   (1.1)

Table 1 below explains some of the variables used in the sequel.

TABLE 1

| Wireless Object | Notation | Mathematical Object |
|---|---|---|
| Number of transmit antennas | L_(t) | A positive integer |
| Number of receive antennas | L_(r) | A positive integer |
| MIMO channel | H | A L_(r) × L_(t) complex matrix |
| Vector of transmitted QAMs | x | A L_(t) × 1 complex vector |
| Variance of noise | σ_(w)² | A positive scalar |
| White noise | w | A 𝒩(0, R_(ww)) random variable |
| Received vector | y | A L_(r) × 1 complex vector |

1.2 MMSE Receive Equalizer and Slicer Based QAM Detection

A minimum mean square error (MMSE) equalizer finds the estimate of the transmitted vector that is optimal in the mean square error sense. This algorithm is typically implemented under the assumption that the noise in the system can be modelled as Additive White Gaussian Noise (AWGN).

If the MMSE receive equalizer is denoted by C, then it can be shown that:

C=(σ_(w) ⁻² H*H+I)⁻¹ H*σ _(w) ⁻²   (1.2)

Applying the MMSE receive equalizer to the received vector gives a vector of soft estimates, denoted as x_(s), of the transmit vector:

x_(s)=Cy   (1.3)

A simple symbol detection mechanism can be implemented using a slicer, which maps each soft estimate to the nearest QAM constellation point. The vector of soft estimates must be close to the transmitted QAM vector for the slicer to recover the transmitted QAM vector correctly.

1.2.1 Residual Error

Let e denote the difference between the soft estimate and the true transmitted vector:

e=x−x _(s)   (1.4)

Denote covariance of e by R_(ee). Then the theory of MMSE gives:

R _(ee)=(σ_(w) ⁻² H*H+I)⁻¹   (1.5)
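As an illustration of (1.2) through (1.5), the following minimal sketch computes the MMSE equalizer and the residual error covariance for a 2×2 channel and then slices the soft estimates. The channel values, the noise variance, the use of unit-energy QPSK and all helper names are assumptions chosen for the example and are not taken from this document.

```python
# Sketch of a 2x2 MMSE equalizer followed by a slicer (pure Python).
# All numeric values and helper names are illustrative assumptions.

def herm(M):
    """Conjugate transpose of a 2x2 matrix."""
    return [[M[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(2)) for i in range(2)]

def inv2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def mmse(H, sigma2):
    """Return (C, R_ee) following equations (1.2) and (1.5)."""
    HhH = matmul(herm(H), H)
    M = [[HhH[i][j] / sigma2 + (1 if i == j else 0) for j in range(2)]
         for i in range(2)]
    R_ee = inv2(M)
    C = [[v / sigma2 for v in row] for row in matmul(R_ee, herm(H))]
    return C, R_ee

def slice_qpsk(x_s):
    """Map each soft estimate to the nearest unit-energy QPSK point."""
    a = 2 ** -0.5
    return [complex(a if z.real >= 0 else -a, a if z.imag >= 0 else -a)
            for z in x_s]

a = 2 ** -0.5
H = [[1.0 + 0.2j, 0.3 - 0.1j], [0.1 + 0.4j, 0.9 - 0.3j]]
x = [complex(a, -a), complex(-a, a)]      # transmitted QAM vector
y = matvec(H, x)                          # noise omitted to keep the run deterministic
C, R_ee = mmse(H, sigma2=0.01)
x_s = matvec(C, y)                        # soft estimates, equation (1.3)
x_h = slice_qpsk(x_s)                     # hard decisions from the slicer
```

In this well-conditioned example the slicer recovers the transmitted vector; the sections that follow concern channels for which coordinate-wise slicing is not sufficient.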

1.2.2 ML Detection Criterion

In certain scenarios, no receiver can perform better than receivers that perform Maximum Likelihood (ML) based symbol detection. ML receivers, however, are computationally intractable and are therefore very hard to implement. Embodiments of the disclosed technology implement a computationally efficient near-ML receiver for OTFS modulated signals.

The QAM estimation problem can be formulated as follows. The receiver needs to find the most likely transmitted vector of QAMs:

$\begin{matrix} {x_{ML} = {\underset{\overset{\sim}{x} \in {QAM}^{L_{t}}}{\arg \; \max}{p\left( \overset{\sim}{x} \middle| y \right)}}} & (1.6) \end{matrix}$

Here, the term QAM^(L_(t)) denotes the collection of QAM-valued vectors of size L_(t).

1.2.3 Probability Approximations

For the channel model described in (1.1), the probability p(x̃|y) can be well approximated by a Gaussian density function with mean x_(s) and covariance R_(ee):

$\begin{matrix} {{p\left( \overset{\sim}{x} \middle| y \right)} \approx {\mathcal{N}\left( {x_{s},R_{ee}} \right)\left( \overset{\sim}{x} \right)} = {\left( {2\pi} \right)^{- \frac{L_{t}}{2}}\left| R_{ee} \right|^{- \frac{1}{2}}{\exp\left( {{- \frac{1}{2}}\left( {\overset{\sim}{x} - x_{s}} \right)^{*}{R_{ee}^{- 1}\left( {\overset{\sim}{x} - x_{s}} \right)}} \right)}.}} & (1.7) \end{matrix}$

1.2.4 QAM Vector Detection Via Quadratic Minimization

Using the probability approximation, the following simplification can be performed:

$\begin{matrix} \begin{matrix} {x_{ML} = {\underset{\overset{\sim}{x} \in {QAM}^{L_{t}}}{\arg\;\max}{p\left( \overset{\sim}{x} \middle| y \right)}}} \\ {\approx {\underset{\overset{\sim}{x} \in {QAM}^{L_{t}}}{\arg\;\max}{\mathcal{N}\left( {x_{s},R_{ee}} \right)\left( \overset{\sim}{x} \right)}}} \\ {= {\underset{\overset{\sim}{x} \in {QAM}^{L_{t}}}{\arg\;\min}\left\| {R_{ee}^{- 1}\left( {\overset{\sim}{x} - x_{s}} \right)} \right\|^{2}}} \end{matrix} & (1.8) \end{matrix}$

The search space QAM^(L_(t)) grows exponentially in L_(t). For example, when L_(t)=4 and |QAM|=64 (corresponding to 64-QAM), then

|QAM^(L_(t))| = 64⁴ ≈ 16.77e6   (1.9)
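The growth of the search space may be tabulated with a short calculation. The function name and the set of antenna counts below are assumptions used only for illustration.

```python
# Size of the exhaustive ML search space |QAM|^(L_t) for a few antenna counts.
# search_space_size is an illustrative helper, not part of the described receiver.

def search_space_size(constellation_size: int, num_tx: int) -> int:
    return constellation_size ** num_tx

sizes = {lt: search_space_size(64, lt) for lt in (1, 2, 4, 8)}
for lt, size in sizes.items():
    print(f"L_t = {lt}: {size} candidate QAM vectors")
```

For L_(t)=4 this reproduces the count in (1.9), and doubling the number of transmit antennas squares it.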

It may be possible to use heuristics to speed up the search. However, these “short cuts” may not provide good results (or may not speed up calculations), particularly for ill-conditioned channels. In conventional MIMO OFDM (e.g., LTE), a frequency block is divided into subcarriers. QAM vectors are transmitted across each of the subcarriers. Each transmitted QAM vector experiences a different channel. To recover the QAM vectors, a receiver may perform a separate QAM detection for each subcarrier. The computational complexity described above is thus multiplied by the number of data sub-carriers in the system, which further forces the receiver to use suboptimal detection algorithms.

1.3 Preprocessing Stage for Near ML Detection of QAM Vectors

In MIMO OTFS the information to be transmitted is specified in the delay-Doppler domain. Namely, for each delay Doppler bin a vector of QAMs is transmitted.

Table 2 summarizes notation used with respect to MIMO OTFS signals.

TABLE 2

| Wireless Object | Notation | Mathematical Object |
|---|---|---|
| Number of transmit antennas | L_(t) | A positive integer |
| Number of receive antennas | L_(r) | A positive integer |
| Number of delay bins | N_(v) | A positive integer |
| Number of Doppler bins | N_(h) | A positive integer |
| Vector of QAMs to transmit | X | A L_(t) × N_(v) × N_(h) complex array |

For simplicity, QAM vectors are assumed to have unit average energy. A QAM vector assigned to the delay-Doppler bin (τ,v) is denoted by x(τ,v).

A MIMO OTFS system can also be described by (1.1). Here, y denotes the received signal in the delay-Doppler domain, H denotes the channel in the delay-Doppler domain and w denotes the noise in the delay-Doppler domain. In a typical equalization structure in OTFS, a feed-forward equalizer (FFE) is applied in the time-frequency domain and a 2D noise-predicting DFE is applied in the hybrid delay-time domain.

FIG. 2 is a block diagram of an example implementation of channel equalization in an OTFS receiver. In FIG. 2, Y denotes the received signal in the time-frequency domain, x_(h) denotes the hard decisions, FFE denotes the feedforward equalizer, FBE denotes the feedback filter, (I)FFT stands for (inverse) fast Fourier transform and RAM represents memory in which intermediate results are stored. Let x_(in)(τ,v) denote the L_(t)×1 dimensional vector (a set of soft symbols) at delay τ and Doppler v that is input to the slicer. The error e_(in) between the soft and hard symbols is:

e_(in)(τ,v) = x_(h)(τ,v) − x_(in)(τ,v)   (1.10)

for all delay-Doppler bins (τ,v).

Due to spreading in OTFS, the covariance matrix of the input error is constant across the delay-Doppler frame. That is,

𝔼[e_(in)(τ,v) e*_(in)(τ,v)] = 𝔼[e_(in)(τ′,v′) e*_(in)(τ′,v′)]   (1.11)

for all pairs of transmission delay-Doppler bins (τ,v) and (τ′,v′).

This means a single matrix represents the error covariance for all delay-Doppler bins of a given transmission frame. Therefore, there exists an L_(t)×L_(t) matrix, denoted R_(ee)^(in), such that

𝔼[e_(in)(τ,v) e*_(in)(τ,v)] = R_(ee)^(in)   (1.12)

for all delay-Doppler bins (τ,v). The matrix (R_(ee)^(in))⁻¹ can be obtained as a by-product of the DFE computation. As described earlier, the detection problem can be approximated by a quadratic minimization problem:

$\begin{matrix} {{{{{{x_{ML}\left( {\tau,v} \right)} \approx \underset{q \in {QAM}^{L_{t}}}{\arg \; \min}}}\left( R_{ee}^{i\; n} \right)^{- 1}\left( {{x_{i\; n}\left( {\tau,v} \right)} - q} \right)}}^{2},} & (1.13) \end{matrix}$

for all delay-Doppler transmission bins (τ,v).

If the matrix (R_(ee)^(in))⁻¹ is perfectly conditioned, an ordinary slicer that slices each QAM symbol (of the QAM vector) co-ordinate by co-ordinate along the standard lattice of x_(in)(τ,v) is optimal. In other words, if the condition number is close to one, then an ordinary slicer can be a near-optimal detector.

In some embodiments, the condition of the matrix (R_(ee) ^(in))⁻¹ can be improved by performing a scheme called lattice reduction. Section 2 and Section 3 provide a theoretical framework and some implementation examples of lattice reduction algorithms.

As depicted in FIG. 3, a QAM constellation can be thought of as a small slice of an infinite lattice. The underlying lattice may be denoted as Λ. A matrix is called a unimodular matrix if all entries are integers and the determinant is one. If a matrix is unimodular, then its inverse is also unimodular. Furthermore, if U is a unimodular matrix of dimensions L_(t)×L_(t), then

v∈Λ^(L_(t)) ⇒ Uv∈Λ^(L_(t)).   (1.14)

For a unimodular matrix U, the pre-conditioned lattice detection problem is

$\begin{matrix} {{x_{ML}^{U}\left( {\tau,v} \right)} = {\underset{q^{\prime} \in \Lambda^{L_{t}}}{\arg\;\min}\left\| {\left( R_{ee}^{in} \right)^{- 1}{U\left( {{U^{- 1}{x_{in}\left( {\tau,v} \right)}} - q^{\prime}} \right)}} \right\|^{2}}.} & (1.15) \end{matrix}$

Here, q′=U⁻¹q.

There exist algorithms to find a unimodular matrix U that makes (R_(ee)⁻¹)U well-conditioned. In some embodiments, a Lenstra-Lenstra-Lovász (LLL) lattice reduction algorithm may be used, as further detailed in Sections 2 and 3. Using these results, an OTFS QAM vector detection scheme may be implemented as follows.

1.4 Near ML Detection Using the Pre-Processed Error Covariance Matrix

First, a lattice reduction algorithm may be implemented to find a unimodular matrix U which makes (R_(ee) ⁻¹)U well-conditioned.

Next, for each delay-Doppler bin the U pre-conditioned detection problem is solved using a computationally inexpensive algorithm (e.g., a Babai detector as described in Section 2).

Then, the result is multiplied by U to get a near ML estimate of the transmitted QAM vector.
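The three steps above may be sketched on a toy real-valued example. The sketch assumes an integer lattice without the QAM boundary, uses coordinate-wise rounding as the inexpensive detector, and picks a matrix W in the role of (R_(ee))⁻¹ together with a unimodular U such that WU is the identity. The matrices, the input vector and the function names are hypothetical.

```python
# Toy sketch: coordinate-wise rounding before and after unimodular pre-conditioning.
# W stands in for (R_ee)^-1; W, U and x_in are illustrative assumptions.
from itertools import product

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def cost(W, x, q):
    """Quadratic cost ||W (x - q)||^2 of the detection problem."""
    d = matvec(W, [xi - qi for xi, qi in zip(x, q)])
    return sum(di * di for di in d)

def slicer(x):
    """Round each coordinate to the nearest integer lattice point."""
    return [round(xi) for xi in x]

def reduced_detect(U, U_inv, x):
    q_prime = slicer(matvec(U_inv, x))   # solve the U pre-conditioned problem
    return matvec(U, q_prime)            # multiply the result by U

W = [[1, -3], [0, 1]]        # ill-conditioned metric factor
U = [[1, 3], [0, 1]]         # unimodular, chosen so that W U = I
U_inv = [[1, -3], [0, 1]]
x_in = [0.4, 0.45]

q_naive = slicer(x_in)
q_reduced = reduced_detect(U, U_inv, x_in)
# Exhaustive search over a small window, used only as a reference.
q_best = min((list(q) for q in product(range(-5, 6), repeat=2)),
             key=lambda q: cost(W, x_in, q))
```

Here the plain slicer returns a point with a larger cost than the pre-conditioned detector, whose output agrees with the exhaustive reference search over the window.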

As observed earlier, in OTFS all transmitted QAM vectors experience the same channel. Therefore, after MMSE equalization all QAM vectors of a given frame have the same error covariance matrix (denoted R_(ee)). This implies that a single pre-processing algorithm, such as lattice reduction, can be applied to R_(ee)⁻¹ for a given OTFS frame. This brings significant computational advantages compared to OFDM. After pre-processing, any of several well-known QAM detection algorithms can be effectively used for all transmitted QAM vectors.

One of skill in the art would appreciate that this technique cannot be used for standard OFDM systems where each transmitted QAM vector experiences a different channel and hence after equalization has a different residual error matrix.

However, it may be possible to reduce computational complexity in traditional OFDM receivers using some techniques described herein. For example, as described, in the case of OFDM systems, the number of error covariance matrices to pre-process will be equal to the number of data sub-carriers in the system. As was noted, the complexity of approximate ML detection in such systems is very high. However, some simplifications can be done to ease this computational burden. For example, R_(ee) from sub-carriers that have similar channel characteristics can be averaged to get a single average R_(ee). This entity can be taken as the representative R_(ee) for all those sub-carriers. This mechanism can bring down the computational load for OFDM systems, with possibly some degradation in the over-all performance.
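The averaging simplification may be sketched as follows. How sub-carriers are grouped by channel similarity is not specified here, so the sketch assumes the group has already been formed; the matrix values and the function name are illustrative assumptions.

```python
# Element-wise average of the residual error matrices of a group of sub-carriers.
# The grouping itself is assumed to be given; values are illustrative only.

def average_covariance(matrices):
    n = len(matrices)
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[i][j] for m in matrices) / n for j in range(cols)]
            for i in range(rows)]

R_group = [
    [[0.10, 0.02j], [-0.02j, 0.20]],
    [[0.14, 0.04j], [-0.04j, 0.16]],
]
R_avg = average_covariance(R_group)   # representative R_ee for the whole group
```

A single lattice reduction would then be run on the inverse of the averaged matrix instead of once per sub-carrier.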

2 An Exemplary Embodiment of a MIMO Turbo Equalizer

2.1 Based Hermitian Lattices

Gaussian Integers. We denote by ℤ_(i) the ring of Gaussian integers, i.e., an element z∈ℤ_(i) is a complex number of the form:

z=a+bi,   (2.1)

where a,b∈ℤ. Gaussian integers constitute the complex generalization of the plain ring of integers ℤ⊂ℝ. All the constructions in this manuscript will be derived over ℂ(ℤ_(i)) with the understanding that they admit a simple counterpart over ℝ(ℤ). In particular, when considering a diagrammatic depiction we will restrict to ℝ(ℤ).

Euclidean metric. Let R∈ℂ^(N×N) be a positive definite Hermitian matrix. The matrix R defines a Euclidean metric on ℂ^(N) given by:

R(x,y)=x^(H)Ry,   (2.2)

for every x,y∈ℂ^(N). We denote by ∥−∥_(R) the norm associated with R, i.e.:

∥x∥_(R)=√(R(x,x)),   (2.3)

for every x∈ℂ^(N). We denote by R_(N) the standard Euclidean metric on ℂ^(N) given by the unit matrix I_(N), i.e.:

R_(N)(x,y)=x^(H)y,   (2.4)

and we denote the norm associated with R_(N) simply by ∥−∥. Finally, given a point x∈ℂ^(N), we denote by B_(r)(x,R) the ball of radius r>0 around x with respect to the metric R, i.e.:

B_(r)(x,R)={y∈ℂ^(N) : ∥y−x∥_(R)<r}.   (2.5)
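The definitions (2.2), (2.3) and (2.5) translate directly into code. The following minimal sketch uses plain Python lists; the example matrix R and the function names are assumptions made for illustration.

```python
# Euclidean metric R(x, y) = x^H R y, its norm, and open-ball membership.
# The matrix R below is an illustrative positive definite Hermitian matrix.
import math

def metric(R, x, y):
    n = len(x)
    return sum(x[i].conjugate() * R[i][j] * y[j]
               for i in range(n) for j in range(n))

def norm(R, x):
    # R(x, x) is real for Hermitian R; the real part discards rounding residue.
    return math.sqrt(metric(R, x, x).real)

def in_ball(R, center, r, y):
    """True when y lies in the open ball of radius r around center."""
    return norm(R, [yi - ci for yi, ci in zip(y, center)]) < r

R = [[2, 1j], [-1j, 2]]
I2 = [[1, 0], [0, 1]]
x = [1 + 0j, 1 + 0j]

norm_R = norm(R, x)          # length of x under the metric R
norm_std = norm(I2, x)       # length of x under the standard metric
```

The same vector has different lengths under R and under the standard metric, which is why nearest-point decisions depend on the metric in use.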

Based Hermitian lattices. A based Hermitian lattice (BHL for short) of rank N is a pair ℒ=(G,R) where R is a Euclidean metric on ℂ^(N) and G is an invertible matrix G∈ℂ^(N×N). The matrix G defines a full rank Gaussian lattice Λ⊂ℂ^(N) given by:

Λ={ξ[1]g₁+ξ[2]g₂+ … +ξ[N]g_(N) : ξ[k]∈ℤ_(i)},   (2.6)

where g_(k) denotes the kth column of G, i.e.:

$\begin{matrix} {G = {\begin{bmatrix} | & | & & | & | \\ g_{1} & g_{2} & \cdots & g_{N - 1} & g_{N} \\ | & | & & | & | \end{bmatrix}.}} & (2.7) \end{matrix}$

Alternatively, the full rank Gaussian lattice can be expressed as:

Λ=G(ℤ_(i)^(N)),   (2.8)

that is, every element λ∈Λ is uniquely represented as λ=Gξ for some ξ∈ℤ_(i)^(N). We refer to the matrix G as a basis of the lattice Λ and to the integral vector ξ as the coordinates of λ with respect to the basis G. The standard example of a rank N based Hermitian lattice is ℒ_(N)=(I,R_(N)).

To summarize, a based Hermitian lattice is a lattice equipped with a basis and residing in a complex vector space equipped with a Euclidean metric. Based Hermitian lattices are objects in a category (in contrast to being points of a set). As a consequence, they admit a multitude of isomorphic representations, which we refer to as realizations. To transform from one realization to another, one applies two basic operations: change of basis and change of coordinates.

Change of basis. A change of basis of ℒ is a based Hermitian lattice ℒ′=(G′,R) where G′=G∘T with T being an N×N matrix of determinant 1 with coefficients in the Gaussian ring ℤ_(i). The matrix T is called the change of basis transition matrix. One can verify that the determinant condition ensures that the inverse matrix T⁻¹ also has integral coefficients. We give two examples of a change of basis of ℒ₂=(I₂,R₂). The first is called a unipotent change of basis (see FIG. 4) and is realized by a transition matrix of the form:

$\begin{matrix} {{T_{a} = \begin{bmatrix} 1 & a \\ 0 & 1 \end{bmatrix}},} & (2.9) \end{matrix}$

where a∈ℤ_(i). The second change of basis is called a flip (see FIG. 5) and it is realized by the transition matrix:

$\begin{matrix} {T_{flip} = {\begin{bmatrix} 0 & 1 \\ {- 1} & 0 \end{bmatrix}.}} & (2.10) \end{matrix}$

In fact, one can show that every unimodular integral matrix T, det T=1, is a finite composition of flips and unipotent transition matrices. This will become important when we discuss algorithms for computing reduced bases.
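A small numerical check of this statement may be sketched by composing a few unipotent and flip matrices and confirming that the product has determinant 1, Gaussian-integer entries and a Gaussian-integer inverse. The particular factors and helper names are arbitrary assumptions.

```python
# Composition of unipotent and flip transition matrices over the Gaussian integers.
# The chosen factors are arbitrary illustrative values.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det2(T):
    return T[0][0] * T[1][1] - T[0][1] * T[1][0]

def unipotent(a):
    """Transition matrix T_a of equation (2.9)."""
    return [[1, a], [0, 1]]

FLIP = [[0, 1], [-1, 0]]     # transition matrix of equation (2.10)

def compose(*factors):
    T = [[1, 0], [0, 1]]
    for F in factors:
        T = matmul(T, F)
    return T

def is_gaussian_integer(z):
    z = complex(z)
    return z.real == round(z.real) and z.imag == round(z.imag)

T = compose(unipotent(1 + 2j), FLIP, unipotent(-3), FLIP)
# For a 2x2 matrix of determinant 1 the inverse is the adjugate.
T_inv = [[T[1][1], -T[0][1]], [-T[1][0], T[0][0]]]
```

Because each factor has determinant 1, so does the product, and the adjugate formula shows that the inverse again has Gaussian-integer entries.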

Change of coordinates. A change of coordinates of ℒ is a based Hermitian lattice ℒ′=(G′,R′) where G′=A∘G and R′=A^(−H)RA⁻¹ with A being an N×N invertible matrix. The matrix A is called the change of coordinates transition matrix. The matrix A defines an isometry between the Euclidean metrics R and R′, that is:

R′(Ax,Ay)=R(x,y),   (2.11)

for every x,y∈ℂ^(N). There is a distinguished realization, called the upper triangular realization, where the basis matrix is upper triangular and the metric R=I_(N). The corresponding transition matrix is given by A=UG⁻¹, where U is the upper triangular factor in the Cholesky decomposition:

G^(H)RG=U^(H)U.   (2.12)

Furthermore, it may be verified that

$\begin{matrix} \begin{matrix} {{G^{\prime} = {{AG} = U}},} \\ {R^{\prime} = {A^{- H}{RA}^{- 1}}} \\ {= {U^{- H}G^{H}{RGU}^{- 1}}} \\ {= I_{N}} \end{matrix} & (2.13) \end{matrix}$
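The upper triangular realization may be checked numerically in rank two. The sketch below uses a hand-written 2×2 Cholesky factorization; the example G and R and all helper names are assumptions for illustration.

```python
# Upper triangular realization: A = U G^-1 with G^H R G = U^H U (rank two).
# G and R are illustrative; the helpers are written out to stay self-contained.
import math

def herm(M):
    return [[M[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def cholesky_upper(M):
    """Upper triangular U with U^H U = M for a 2x2 positive definite Hermitian M."""
    u11 = math.sqrt(M[0][0].real)
    u12 = M[0][1] / u11
    u22 = math.sqrt(M[1][1].real - abs(u12) ** 2)
    return [[u11, u12], [0, u22]]

G = [[1 + 0j, 1 + 0j], [0j, 1 + 0j]]
R = [[2 + 0j, 1j], [-1j, 2 + 0j]]

U = cholesky_upper(matmul(herm(G), matmul(R, G)))
A = matmul(U, inv2(G))                            # change of coordinates
G_new = matmul(A, G)                              # new basis, expected to equal U
A_inv = inv2(A)
R_new = matmul(herm(A_inv), matmul(R, A_inv))     # new metric, expected to be I
```

In this realization the metric is the standard one, so the geometry of the lattice is carried entirely by the upper triangular basis.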

LLL reduced basis. A based Hermitian lattice ℒ=(G,R) is called LLL reduced (LLL stands for Lenstra-Lenstra-Lovász) if, roughly speaking, the vectors of the basis G are nearly orthogonal with respect to the metric R, i.e., R(g_(i),g_(j))≈0 for every i≠j.

Dimension two. We denote by P₁ the orthogonal projection on the one dimensional subspace V₁=ℂg₁ and respectively by P₁^(⊥)=I−P₁ the orthogonal projection on the complement subspace. The formula of P₁ is given by:

$\begin{matrix} {{{P_{1}(x)} = {g_{1}\frac{R\left( {g_{1},x} \right)}{R\left( {g_{1},g_{1}} \right)}}},} & (2.14) \end{matrix}$

for every x∈ℂ².

Definition 1.1. We say that ℒ is LLL reduced if it satisfies the following two conditions:

(1) Size reduction condition:

|Re R(g₁,g₂)| ≤ ½ R(g₁,g₁)

|Im R(g₁,g₂)| ≤ ½ R(g₁,g₁)   (2.15)

(2) Well ordering condition:

R(g₁,g₁) ≤ R(g₂,g₂)   (2.16)

The main statement in the two-dimensional theory is summarized in the following theorem:

Theorem 1.2 (Reduction theorem). If g₁,g₂ is an LLL reduced basis then:

(1) The vector g₁ satisfies:

√(R(g₁,g₁)) ≤ c√(R(λ_(short),λ_(short))),

where c is a universal constant that does not depend on the lattice, and λ_(short) denotes the shortest non-zero vector in Λ.

(2) The vector g₂ satisfies:

√(R(P₁^(⊥)g₂,g₂)) ≤ √(R(P₁^(⊥)λ,λ)),

for every vector λ∈Λ with non-zero orthogonal projection on V₁^(⊥).

In words, Theorem 1.2 asserts that the first basis vector g₁ is no longer than a scalar multiple of the shortest non-zero vector in the lattice where the scalar is universal (does not depend on the lattice) and that the second basis vector is the shortest non-zero vector mod g₁.
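One way to reach a basis satisfying the two conditions of Definition 1.1 is to alternate unipotent steps and flips until the well ordering condition holds. The following minimal sketch of such a rank-two procedure is an assumption made for illustration and is not necessarily the algorithm of Section 3; the function names, the iteration limit and the example basis are hypothetical.

```python
# Rank-two reduction sketch: size-reduce g2 against g1, flip when g2 is shorter.
# Returns the reduced basis and the accumulated transition matrix T.

def metric(R, x, y):
    n = len(x)
    return sum(x[i].conjugate() * R[i][j] * y[j]
               for i in range(n) for j in range(n))

def reduce_rank2(g1, g2, R, max_iter=100):
    T = [[1, 0], [0, 1]]
    for _ in range(max_iter):
        mu = metric(R, g1, g2) / metric(R, g1, g1)
        a = complex(round(mu.real), round(mu.imag))   # nearest Gaussian integer
        # Unipotent step: g2 <- g2 - a*g1, i.e. right-multiply by [[1, -a], [0, 1]].
        g2 = [v - a * u for u, v in zip(g1, g2)]
        T = [[T[0][0], T[0][1] - a * T[0][0]],
             [T[1][0], T[1][1] - a * T[1][0]]]
        if metric(R, g1, g1).real <= metric(R, g2, g2).real:
            return g1, g2, T                          # well ordering holds
        # Flip step: (g1, g2) <- (-g2, g1), i.e. right-multiply by [[0, 1], [-1, 0]].
        g1, g2 = [-v for v in g2], g1
        T = [[-T[0][1], T[0][0]], [-T[1][1], T[1][0]]]
    raise RuntimeError("reduction did not converge within max_iter")

R = [[1, 0], [0, 1]]
g1_0, g2_0 = [4, 1], [7, 2]           # a skewed basis of the integer lattice
g1, g2, T = reduce_rank2(g1_0, g2_0, R)
```

The returned T has determinant 1 by construction, so the reduced pair spans the same lattice as the original basis.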

General dimension. Assuming ℒ is of rank N, we denote by V_(n) the subspace spanned by the first n basis vectors, i.e., V_(n)=span(g₁,g₂, …, g_(n)). We take the convention V₀={0}. The collection of subspaces V_(n), n=0, …, N, forms a complete flag:

0=V₀⊂V₁⊂ … ⊂V_(N)=ℂ^(N).   (2.17)

We denote by P_(n) the orthogonal projection (with respect to R) on the subspace V_(n). Respectively, we denote by P_(n)^(⊥)=I−P_(n) the orthogonal projection on the orthogonal complement subspace.

Definition 1.3. We say that (G,R) is LLL reduced if it satisfies the following two conditions:

(1) Size reduction condition:

|Re R(P_(n−1)^(⊥)g_(n),g_(m))| ≤ ½ R(P_(n−1)^(⊥)g_(n),g_(n))

|Im R(P_(n−1)^(⊥)g_(n),g_(m))| ≤ ½ R(P_(n−1)^(⊥)g_(n),g_(n)),

for every n=1, …, N−1 and m>n.

(2) Well ordering condition:

R(P_(n−1)^(⊥)g_(n),g_(n)) ≤ R(P_(n−1)^(⊥)g_(n+1),g_(n+1)),

for every n=1, …, N−1.

When G is upper triangular and R=I, the LLL conditions take a particularly simple form. The size reduction condition takes the form:

|Re g_(nm)| ≤ ½ g_(nn)

|Im g_(nm)| ≤ ½ g_(nn),

for every n=1, . . . N−1 and m>n. The ordering condition takes the form:

|g _(nn)|² ≤|g _(n,n+1)|² +|g _(n+1,n+1)|².
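The simplified conditions for an upper triangular basis with R=I can be checked mechanically. The sketch below assumes real positive diagonal entries, as produced by a Cholesky factorization, and uses a small tolerance for floating-point comparisons; the function name and test matrices are illustrative assumptions.

```python
# Check of the size reduction and ordering conditions for upper triangular G, R = I.
# Indices are zero-based in code; the diagonal is assumed real and positive.

def is_lll_reduced_upper(G, tol=1e-12):
    N = len(G)
    for n in range(N - 1):
        g_nn = complex(G[n][n]).real
        # Size reduction: |Re g_nm| and |Im g_nm| bounded by g_nn / 2 for m > n.
        for m in range(n + 1, N):
            g_nm = complex(G[n][m])
            if abs(g_nm.real) > 0.5 * g_nn + tol:
                return False
            if abs(g_nm.imag) > 0.5 * g_nn + tol:
                return False
        # Ordering: |g_nn|^2 <= |g_n,n+1|^2 + |g_n+1,n+1|^2.
        lhs = abs(G[n][n]) ** 2
        rhs = abs(G[n][n + 1]) ** 2 + abs(G[n + 1][n + 1]) ** 2
        if lhs > rhs + tol:
            return False
    return True

reduced = is_lll_reduced_upper([[1, 0.5 + 0.5j], [0, 1]])
not_size_reduced = is_lll_reduced_upper([[1, 3], [0, 1]])
not_ordered = is_lll_reduced_upper([[2, 0], [0, 1]])
```

The second matrix fails size reduction and the third fails the ordering condition, which correspond to the cases where a unipotent step or a flip, respectively, would still be applied.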

2.2 Hard/Soft Detection Problem

In this section, a delay-Doppler convolution channel model is assumed:

y=h*x+w,   (2.18)

where x[τ,v]∈ℂ^(N), y[τ,v]∈ℂ^(M), h[τ,v]∈ℂ^(M×N) and w is the noise term. We assume that the input variables x[τ,v] are independent random variables, taking values in a QAM constellation set Ω⊂ℂ^(N), #Ω=2^(NQ).

Our ultimate goal is to calculate the finite a-posteriori probability distribution. Since this is a formidable problem, we approximate the prior probability distribution of x[τ,v] by a circular symmetric Gaussian distribution 𝒞𝒩(x̄[τ,v], R_(x[τ,v])). We denote by x̂^(s)=x̂^(s)[τ,v] the MMSE estimate of the MIMO symbol x=x[τ,v]. The random variables x̂^(s) and x are related through the backward and scaling equations:

$\begin{matrix} {{x = {{\hat{x}}^{s} + e}},} & (2.19) \\ {{{\hat{x}}^{s} = {{Ax} + {\left( {1 - A} \right)\overset{\_}{x}} + \text{?}}},} & (2.20) \end{matrix}$

where e⊥x̂^(s) and ?⊥x (? indicates text missing or illegible when filed). Furthermore, the parameters of the above equations are related through:

$\begin{matrix} {{A = {I - {R_{e}R_{x}^{- 1}}}},\quad{R_{\text{?}} = {AR}_{e}}.} & (2.21) \end{matrix}$

The MMSE variable x̂^(s) establishes a sufficient statistic for x; moreover, the a-posteriori probability distribution is given by:

$\begin{matrix} {{P\left( x \middle| {\hat{x}}^{s} \right)} \propto {\exp\left( {- {R_{e}^{- 1}\left( {{{\hat{x}}^{s} - x},{{\hat{x}}^{s} - x}} \right)}} \right)}.} & (2.22) \end{matrix}$

Roughly speaking, we wish to find a small subset of Ω that faithfully represents (2.22). We refer to this problem as the soft MAP detection problem. The expression in the exponent of (2.22) suggests defining the solution in geometric terms using the Euclidean metric R=R_(e)⁻¹. To this end, we introduce the following notion:

Definition 2.1. An Ω-ball of size K around the point x̂^(s), where K∈ℕ^(≥1), is the finite set:

B_(K)(x̂^(s),R,Ω)=B_(r(K))(x̂^(s),R)∩Ω,   (2.23)

where r(K) is the maximum radius such that |B_(r)(x̂^(s),R)∩Ω|≤K.

In plain language, (2.23) is the ball of maximum radius around x̂^(s) that contains at most K constellation points. In particular, when K=1 the ball consists of one element x̂^(h), called the MAP detector, defined by:

x̂^(h)=arg min{∥x̂^(s)−x∥_(R) : x∈Ω}.   (2.24)

The soft MAP detection problem is formulated as the problem of calculating the Ω-ball of size K for various choices of the parameter K. Note that (2.23) gives a precise mathematical definition of an optimal list of size K that approximates the a-posteriori probability distribution. In some embodiments, as the size of the list gets larger, the corresponding approximation gets better.
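For a small constellation, the Ω-ball of size K can be computed by brute force, which makes the definition concrete. The sketch below assumes two QPSK symbols with unit spacing and an arbitrary metric; the soft estimate, the metric and the function names are hypothetical, and the exhaustive enumeration is for illustration only.

```python
# Brute-force Omega-ball of size K: constellation points strictly closer than the
# (K+1)-th smallest distance. Exhaustive enumeration, illustrative values only.
from itertools import product

def norm_sq(R, d):
    n = len(d)
    return sum(d[i].conjugate() * R[i][j] * d[j]
               for i in range(n) for j in range(n)).real

def omega_ball(x_soft, R, omega, K):
    dists = sorted(
        ((norm_sq(R, [a - b for a, b in zip(x_soft, x)]), x) for x in omega),
        key=lambda t: t[0])
    if K >= len(dists):
        return [x for _, x in dists]          # every radius is admissible
    cutoff = dists[K][0]                      # squared r(K): (K+1)-th distance
    return [x for d, x in dists if d < cutoff]

points = [complex(re, im) for re in (-0.5, 0.5) for im in (-0.5, 0.5)]
omega = list(product(points, repeat=2))       # N = 2 symbols, 16 vectors
R = [[2, 1j], [-1j, 2]]
x_soft = [0.4 + 0.3j, -0.6 + 0.2j]

x_hard = omega_ball(x_soft, R, omega, 1)      # MAP detector when K = 1
ball_4 = omega_ball(x_soft, R, omega, 4)      # list of at most 4 candidates
```

Because the ball is open and its radius is the largest one admitting at most K points, ties in distance can make the list shorter than K.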

Extrinsic a-posteriori bit probabilities (LLR computation). Let L:Ω→ℝ^(≥0) denote the total MIMO likelihood function defined through the scaling equation (2.20) as:

L(x)∝exp(−∥x̂^(s)−(Ax+(1−A)x̄)∥_(R)²),   (2.25)

for every x∈Ω. Let b_(i)^(n) denote the ith bit of the nth MIMO symbol where n=1, …, N and i=1, …, Q. Let P_(i)^(n):{0,1}→ℝ^(≥0) denote the prior probability distribution on the (i,n) bit. One can verify that the (exact) extrinsic a-posteriori probability on the (i,n) bit is given by:

$\begin{matrix} {{E_{i}^{n}(b)} = {\sum\limits_{{b:b_{i}^{n}} = b}{{L\left( {x = {m(b)}} \right)}{\prod\limits_{{({j,m})} \neq {({i,n})}}{{P_{j}^{m}\left( b_{j}^{m} \right)}.}}}}} & (2.26) \end{matrix}$

where m:{0,1}^(QN)→Ω is the total mapping rule, converting between bits and constellation vectors.

Typically, in formula (2.26), the summation over all elements b∈{0,1}^(QN−1) is intractable for large values of N and/or Q. Instead, using the list B_(K)(x̂^(s),R,Ω), we can define the approximated likelihood function:

$\begin{matrix} {{\overset{\_}{L}(x)} = \left\{ {\begin{matrix} {L(x)} & {x \in {B_{K}\left( {{\hat{x}}^{s},R,\Omega} \right)}} \\ 0 & {x \notin {B_{K}\left( {{\hat{x}}^{s},R,\Omega} \right)}} \end{matrix}.} \right.} & (2.27) \end{matrix}$

Using this approximation, define the approximated extrinsic probability distribution by:

$\begin{matrix} {{{\overset{\_}{E}}_{i}^{n}(b)} = {\sum\limits_{{b:b_{i}^{n}} = b}{{\overset{\_}{L}\left( {x = {m(b)}} \right)}{\prod\limits_{{({j,m})} \neq {({i,n})}}{{P_{j}^{m}\left( b_{j}^{m} \right)}.}}}}} & (2.28) \end{matrix}$

The summation in (2.28) is over a much smaller set of size ≤K. In order for (2.28) to be well posed, the set B_(K)(x̂^(s),R,Ω) should satisfy a mild technical requirement that for every (i,n) and bit value b∈{0,1} there exists a vector b∈{0,1}^(QN) such that:

b_(i)^(n)=b

m(b)∈B_(K)(x̂^(s),R,Ω).   (2.29)

2.3 Examples of Lattice Relaxation

In some embodiments, computing (2.23) may be difficult since the search space Ω grows exponentially with the MIMO dimension N and the constellation order Q. This difficulty increases when the Euclidean metric R is highly correlated. One example of reducing complexity of the search problem is to embed Ω inside a bigger set that is better behaved (and is generally referred to as problem relaxation).

One way to do that is to embed Ω as a subset of an infinite (shifted) lattice. Let Λ=ℤ_(i) ^(N) denote the standard Gaussian lattice of rank N. Let v₀=(1+i)/2·1 where 1=(1, 1, . . . 1). One can easily verify that when Q is even, we have:

Ω⊂Λ+v₀.   (2.30)

We equip Λ with the based Hermitian lattice structure ℒ=(G,R) where G=I_(N). We define {circumflex over (x)}={circumflex over (x)}^(s)−v₀. We consider discrete balls centered at the point {circumflex over (x)} parametrized either by their radius or their size: given a positive real number r>0, we define the ℒ-ball of radius r around {circumflex over (x)} as the set:

B _(r)({circumflex over (x)},ℒ)={ξ∈ℤ_(i) ^(N) :∥{circumflex over (x)}−G(ξ)∥_(R) <r}.   (2.31)

In words, the set (2.31) consists of all lattice points inside a ball of radius r around the point {circumflex over (x)} or, more precisely, the coordinates of these points with respect to the standard basis.

Given a positive integer K∈ℕ^(≥1), we define the ℒ-ball of size K around {circumflex over (x)} as the set:

B _(K)({circumflex over (x)},ℒ)=B _(r(K))({circumflex over (x)},ℒ),   (2.32)

where r(K) is the maximum radius such that |B_(r)({circumflex over (x)},ℒ)|≤K, i.e.:

r(K)=max{r:|B _(r)({circumflex over (x)},ℒ)|≤K}.   (2.33)

In words, the set (2.32) is the ℒ-ball of maximum radius that contains at most K points. In some embodiments, the relation B_(K)({circumflex over (x)}^(s),R,Ω)=B_(K)({circumflex over (x)},ℒ)+v₀ is available. The solution of the soft lattice detection problem may be defined to be the ball B_(K)({circumflex over (x)},ℒ) for various choices of the parameter K. One benefit, amongst others, in replacing (2.23) by (2.32) is that based Hermitian lattices (unlike constellation sets) admit a multitude of isomorphic realizations (or images). This advantageously enables a convenient realization for ℒ to be chosen for computing (2.32), and this flexibility is one reason the lattice relaxation approach is employed.

2.4 Example of an Approximate Solution for Soft Lattice Detection

In some embodiments, an approximate solution of the soft lattice detection problem may be implemented efficiently. For example, the construction may use an LLL reduced upper triangular realization of the based Hermitian lattice ℒ, which is denoted as ℒ_(LR).

That is, ℒ_(LR)=(U,I_(N)) where U=(u_(nm)) is an upper triangular matrix that is size reduced and well ordered. We further assume that we are given a change of basis T and a change of coordinates A such that U=A∘G∘T and I_(N)=A^(−H)∘R∘A⁻¹. We construct an approximation of the ball B_(K)(ŷ,ℒ_(LR)) where ŷ=A{circumflex over (x)} and then derive an approximation of the ball B_(K)({circumflex over (x)},ℒ) through the relation:

B _(K)({circumflex over (x)},ℒ)=T(B _(K)(ŷ,ℒ_(LR))).   (2.34)

We introduce a one parametric family of probability distributions on ℤ_(i) ^(N). For any positive real number a>0 we define a probability distribution P_(a):ℤ_(i) ^(N)→ℝ^(≥0) expressed as a product of one dimensional conditional probability distributions, i.e.:

$\begin{matrix} {{{P_{a}(\xi)} = {\prod\limits_{n = 1}^{N}{P_{a}\left( {\xi \lbrack n\rbrack} \middle| {\xi \left\lbrack {{n + 1}:N} \right\rbrack} \right)}}},} & (2.35) \end{matrix}$

for every ξ=(ξ[1],ξ[2] . . . ξ[N])∈ℤ_(i) ^(N). The nth conditional probability is given by:

$\begin{matrix} {{P_{a}\left( {\xi\lbrack n\rbrack = z} \middle| {\xi\lbrack {n + 1}:N\rbrack} \right)} = {\frac{1}{s_{n}\left( {\xi\lbrack {n + 1}:N\rbrack} \right)}{\exp\left( {{- a}\left| {{\hat{y}}_{n} - {u_{nn}z}} \right|^{2}} \right)}}},} & (2.36) \end{matrix}$

where

$\begin{matrix} {{{\hat{y}}_{n} = {{\hat{y}\lbrack n\rbrack} - {\sum\limits_{m = {n + 1}}^{N}{{\xi\lbrack m\rbrack}u_{nm}}}}},\qquad{{s_{n}\left( {\xi\lbrack {n + 1}:N\rbrack} \right)} = {\sum\limits_{z \in {\mathbb{Z}}_{i}}{\exp\left( {{- a}\left| {{\hat{y}}_{n} - {u_{nn}z}} \right|^{2}} \right)}}}.} & (2.37) \end{matrix}$

We note that s_(n)(ξ[n+1:N]) is merely the normalization factor that ensures that the total probability is equal to 1. Moreover, the complex number ŷ_(n) has a geometric meaning: to explain it, we denote by P_(n) the orthogonal projection on the subspace V_(n) spanned by the basis vectors u₁, . . . u_(n), which by the upper triangular structure coincides with the standard coordinate n dimensional subspace V_(n)={x:x[n+1:N]=0}. We have:

$\begin{matrix} {{{\hat{y}}_{n}e_{n}} = {P_{n - 1}^{\bot}{{P_{n}\left( {\hat{y} - {\sum_{n + 1}^{N}{{\xi \lbrack m\rbrack}u_{m}}}} \right)}.}}} & (2.38) \end{matrix}$

To get some intuition for the construction it is worthwhile to consider the limiting case when a→∞. One can see that in the limit the nth conditional probability collapses to a deterministic delta supported at the element ξ_(babai)[n]=┌ŷ_(n)/u_(nn)┘ where ┌z┘∈ℤ_(i) denotes rounding to the closest Gaussian integer to z. The vector ξ_(babai)=(ξ_(babai)[1], . . . ξ_(babai)[N]) is called the Babai lattice point and it can be interpreted as the solution of a successive cancellation detector. Fixing the value of the parameter K, we define the set:

B _(K)(a)={ξ:P _(a)(ξ)≥1/K}.   (2.39)

In words, the set B_(K)(a) consists of all points ξ∈ℤ_(i) ^(N) whose probability is ≥1/K. Evidently, the number of elements in B_(K)(a) is ≤K since the total probability is equal to 1. The upshot is that B_(K)(a) is a good approximation of the ball B_(K)(ŷ,ℒ_(LR)) if the parameter a is chosen appropriately. The main technical statement is that the set B_(K)(a) contains an ℒ_(LR)-ball of a certain radius r, which of course depends on the values of a and K. The precise statement is summarized in the following lemma:

Lemma 4.1 (Technical Lemma). We have B_(r)(ŷ,ℒ_(LR))⊂B_(K)(a) where r=r(a,K) is given by:

$\begin{matrix} {{{r\left( {a,K} \right)}^{2} = {\min\limits_{m}{{u_{m\; m}}^{2}\left( {\frac{\ln (K)}{\ln (\rho)} - \frac{4N}{\rho \; {\ln (\rho)}}} \right)}}},} & (2.40) \end{matrix}$

with a=ln(ρ)/min_(m)|u_(mm)|².

Granting the validity of Lemma 4.1, it is natural to ask, for a given value of K, what is the optimal value of the parameter ρ that maximizes the radius r(a,K).

Denoting by ƒ(ρ) the expression on the right hand side of (2.40) and equating the derivative ƒ′(ρ₀)=0, it may be verified that ρ₀ and K are related through:

K=(eρ₀)^(4N/ρ₀).   (2.41)

Substituting ρ₀ back in ƒ we get the following formula for the optimal radius:

$\begin{matrix} \begin{matrix} {{r(K)}^{2} = {\min\limits_{m}{{u_{m\; m}}^{2}\left( {{\frac{4N}{\rho_{0}}\frac{{\ln \left( \rho_{0} \right)} + 1}{\ln \left( \rho_{0} \right)}} - \frac{4N}{\rho_{0}{\ln \left( \rho_{0} \right)}}} \right)}}} \\ {{= {\frac{4N}{\rho_{0}}{\min\limits_{m}{u_{m\; m}}^{2}}}},} \end{matrix} & (2.42) \end{matrix}$

To put things in order, it is convenient to introduce an additional parameter: for a given radius r>0, we define:

$\begin{matrix} {{G(r)} = {\frac{4r^{2}}{\min_{m}{u_{m\; m}}^{2}}.}} & (2.43) \end{matrix}$

To explain the meaning of this parameter we note that the expression ½ min_(m)|u_(mm)| is the Babai radius, that is if ∥ŷ−Uξ^(★)∥≤½ min_(m)|u_(mm)| then ξ_(babai)=ξ^(★) where ξ^(★) is the closest lattice point to ŷ. Granting this fact, G(r) can be interpreted as the radial gain of r over the Babai radius. For example if G(r)=2, we say r has 3 dB gain over the Babai radius. In practice, it is natural to fix the gain G and to ask for the value of the parameter K such that 4·r(K)²/min_(m)|u_(mm)|²=G.
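The Babai point and the Babai radius property can be illustrated with a short sketch. It assumes that U is given as a nested list of complex numbers with nonzero diagonal and that ŷ is a list of complex numbers; the helper names `round_gaussian` and `babai_point` are hypothetical.

```python
def round_gaussian(z):
    """Round a complex number to the closest Gaussian integer."""
    return complex(round(z.real), round(z.imag))


def babai_point(U, y):
    """Successive cancellation from the last coordinate to the first.

    U: upper triangular N x N matrix (list of rows), y: list of N complex values.
    """
    n_dim = len(y)
    xi = [0j] * n_dim
    for n in range(n_dim - 1, -1, -1):
        # Remove the contribution of the coordinates decided so far.
        y_n = y[n] - sum(U[n][m] * xi[m] for m in range(n + 1, n_dim))
        xi[n] = round_gaussian(y_n / U[n][n])
    return xi


# Example: the lattice point U*xi_true is perturbed by an offset of norm about
# 0.55, which is below the Babai radius (1/2)*min|u_mm| = 1.
U_demo = [[2, 1 + 1j], [0, 3]]
xi_true = [1 + 2j, -1 + 1j]
y_demo = [0.3 + 3.8j, -3.4 + 3.1j]  # equals U_demo * xi_true + (0.3-0.2j, -0.4+0.1j)
xi_babai = babai_point(U_demo, y_demo)
```

In this example the offset lies inside the Babai radius, so the successive cancellation result coincides with the closest lattice point.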

Using the expression in (2.42), and the definition in (2.43), we can derive:

$\begin{matrix} {G = {4 \cdot {{r(K)}^{2}/{\min\limits_{m}{u_{m\; m}}^{2}}}}} \\ {= {\frac{16N}{\rho_{0}}.}} \end{matrix}$

This implies that ρ₀=16N/G. Hence, using (2.41), we get:

K=(16Ne/G)^(G/4).   (2.44)

From formula (2.44), the size parameter K as a function of the radial gain can be seen to grow as O(N^(G/4)) with the MIMO order, and thus, to realize larger gains, larger list sizes may be employed. For example, to get a radial gain of 3 dB, we use K=√(8eN) and ρ₀=8N.
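The relations (2.40) through (2.44) can be checked numerically with a small sketch. The function names are hypothetical, and the radius is expressed in units of min_(m)|u_(mm)|² so that no lattice needs to be supplied.

```python
import math


def list_size_for_gain(gain, n_dim):
    """Return (rho0, K) for a target radial gain G, per rho0 = 16N/G and (2.44)."""
    rho0 = 16.0 * n_dim / gain
    k = (math.e * rho0) ** (gain / 4.0)
    return rho0, k


def normalized_radius_sq(rho, k, n_dim):
    """Bracketed factor of (2.40), i.e. r(a,K)^2 divided by min_m |u_mm|^2."""
    return math.log(k) / math.log(rho) - 4.0 * n_dim / (rho * math.log(rho))


# A radial gain of 3 dB (G = 2) for N = 4.
rho0_demo, k_demo = list_size_for_gain(2.0, 4)
r_sq_demo = normalized_radius_sq(rho0_demo, k_demo, 4)
```

For G=2 and N=4 this gives ρ₀=32 and K=√(32e), roughly 9.3, and the normalized radius evaluates to 4N/ρ₀=1/2, in agreement with (2.42).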

2.5 Exemplary Deformation of the Sphere Decoder

In this section we describe a deformation of the standard sphere decoder that is specifically adapted to computing the set B_(K)(a). For the rest of this section we fix the value of the parameters K and a. Next we introduce some basic terminology related to the underlying combinatorics of sphere decoding.

FIG. 8 shows an exemplary definition of a layered directed tree structure on the standard lattice ℤ_(i) ^(N). The tree consists of N+1 levels.

For every number n=1, . . . N+1, we denote the nth level by L^(n) and it consists of elements ξ∈ℤ_(i) ^(N) such that ξ[m]=0 for every m<n. In particular, the level L^(N+1) consists of one element which is the zero vector, called the root of the tree; on the other extreme, the level L¹ is equal to ℤ_(i) ^(N) and its elements are called leaves. The edges connect vertices in L^(n+1) with vertices in L^(n): given a pair of elements ξ^(n+1)∈L^(n+1) and ξ^(n)∈L^(n), the ordered pair (ξ^(n+1),ξ^(n)) is an edge if ξ^(n)[m]=ξ^(n+1)[m] for every m≥n+1.

Formula (2.36) may be used to introduce weights on the edges. For example, given an edge e^(n)=(ξ^(n+1),ξ^(n)), its weight may be defined as:

$\begin{matrix} \begin{matrix} {{\omega (e)} = {{- \ln}\; {P_{a}\left( {\xi^{n}\lbrack n\rbrack} \middle| \xi^{n + 1} \right)}}} \\ {= {{a{{{\hat{y}}_{n} - {u_{nn}{\xi^{n}\lbrack n\rbrack}}}}^{2}} + {\ln \mspace{14mu} {{s_{n}\left( \xi^{n + 1} \right)}.}}}} \end{matrix} & (2.45) \end{matrix}$

It may be observed that a leaf ξ∈L¹ belongs to the set B_(K)(a) if and only if:

$\begin{matrix} {{\sum_{n = N}^{1}{\omega \left( e^{n} \right)}} \leq {{\ln (K)}.}} & (2.46) \end{matrix}$

Here e^(N),e^(N−1), . . . ,e¹ are the edges along the unique branch leading from the root to the leaf ξ. Note that substituting a=1 in (2.45) and omitting the affine term ln s_(n)(ξ^(n+1)), we get the weights used by the conventional sphere decoder. In this regard, the new weights may be interpreted as a deformation of the sphere decoder structure. In some embodiments, and based on this weighted tree structure, the algorithm for computing the set B_(K)(a) proceeds as a depth first search algorithm along the vertices of the weighted tree.
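A depth first search over the weighted tree can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the infinite sum defining s_(n) is truncated to a square window of Gaussian integers around the rounded center (the `half_width` parameter is an assumption introduced here), and the names `gaussian_window` and `deformed_sphere_list` are hypothetical.

```python
import math


def gaussian_window(center, half_width):
    """Gaussian integers in a square window around the rounded center."""
    c_re, c_im = round(center.real), round(center.imag)
    return [complex(c_re + dr, c_im + di)
            for dr in range(-half_width, half_width + 1)
            for di in range(-half_width, half_width + 1)]


def deformed_sphere_list(U, y, a, K, half_width=3):
    """Collect leaves whose accumulated weight (2.46) is at most ln(K)."""
    n_dim = len(y)
    budget = math.log(K)
    found = []

    def visit(n, xi, weight):
        if n < 0:
            found.append((list(xi), weight))
            return
        y_n = y[n] - sum(U[n][m] * xi[m] for m in range(n + 1, n_dim))
        cands = gaussian_window(y_n / U[n][n], half_width)
        dists = [abs(y_n - U[n][n] * z) ** 2 for z in cands]
        # Truncated normalization factor s_n of (2.37).
        log_s = math.log(sum(math.exp(-a * d) for d in dists))
        for z, d in sorted(zip(cands, dists), key=lambda pair: pair[1]):
            w = weight + a * d + log_s  # edge weight (2.45) added to the path
            if w > budget:
                break  # edge weights are non-negative and sorted, so prune
            xi[n] = z
            visit(n - 1, xi, w)
        xi[n] = 0j

    visit(n_dim - 1, [0j] * n_dim, 0.0)
    return found


U_demo = [[2, 1 + 1j], [0, 3]]
y_demo = [0.3 + 3.8j, -3.4 + 3.1j]
leaves = deformed_sphere_list(U_demo, y_demo, a=1.0, K=4)
```

Because every edge weight equals minus the logarithm of a probability, it is non-negative, so a branch can be abandoned as soon as its partial sum exceeds ln(K). Within the truncated window the leaf probabilities sum to 1, so at most K leaves are returned.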

2.6 Examples of Lattice Reduction Algorithm in Two Dimensions

Reduction algorithm: invariant form. The algorithm accepts as input a basis matrix G and produces as output an LLL reduced basis matrix G′ such that G′=G∘T, where T∈SL₂(ℤ_(i)). Specifically, the algorithm constructs a sequence:

G=G⁰,G¹, . . . G^(K)=G′,   (2.47)

such that G^(k+1)=G^(k)∘T^(k) where T^(k)∈SL₂(ℤ_(i)). Consequently, we have T=T⁰∘T¹∘ . . . ∘T^(K−1).

The sequence (2.47) is constructed via alternating between the two basic transformations.

(1) Size reduction transformation. Reduce the size of the second basis vector with respect to the first basis vector according to the rule g₂ ^(k+1)=g₂ ^(k)−a·g₁ ^(k) where:

${a = \left\lbrack \frac{R\left( {g_{1}^{k},g_{2}^{k}} \right)}{R\left( {g_{1}^{k},g_{1}^{k}} \right)} \right\rbrack},$

-   -   This transformation is realized by the unimodular matrix:

$T^{k} = {\begin{bmatrix} 1 & {- a} \\ 0 & 1 \end{bmatrix}.}$

(2) Flipping transformation. Interchange the basis vectors according to the rule g₁ ^(k+1)=−g₂ ^(k) and g₂ ^(k+1)=g₁ ^(k). This transformation is realized by the unimodular matrix:

${T^{k} = \begin{bmatrix} 0 & 1 \\ {- 1} & 0 \end{bmatrix}},$

Reduction algorithm: upper triangular form. We describe a version of the LLL reduction algorithm which is adapted to the upper-triangular realization. The algorithm accepts as input an upper-triangular basis matrix U and produces as output an LLL reduced basis matrix U′ such that U′=A∘U∘T where T∈SL₂(ℤ_(i)) and A∈U(ℂ²). Specifically, the algorithm constructs a sequence of upper triangular matrices:

U=U⁰,U¹, . . . U^(K)=U′,   (2.48)

such that U^(k+1)=A^(k)∘U^(k)∘T^(k) where T^(k)∈SL₂(ℤ_(i)) and A^(k)∈U(ℂ²) (unitary matrix). Consequently, we have T=T⁰∘T¹∘ . . . ∘T^(K−1) and A=A^(K−1)∘ . . . ∘A¹∘A⁰. In what follows, we use the notation u_(ij) ^(k) for the (i,j) coordinate of the matrix U^(k) and denote by u_(n) ^(k)=U^(k)(:,n) the nth vector of the kth basis.

The sequence of generators in (2.48) is constructed by alternating between the following two types of transformations.

(1) Size reduction transformation. Reduce the size of the second basis vector with respect to the first basis vector according to the rule u₂ ^(k+1)=u₂ ^(k)−au₁ ^(k) where:

$a = {\left\lbrack \frac{\langle{u_{1}^{k},u_{2}^{k}}\rangle}{\langle{u_{1}^{k},u_{1}^{k}}\rangle} \right\rbrack = {\left\lbrack \frac{u_{12}^{k}}{u_{11}^{k}} \right\rbrack.}}$

-   -   This transformation is realized by the unimodular matrix:

$T^{k} = {\begin{bmatrix} 1 & {- a} \\ 0 & 1 \end{bmatrix}.}$

(2) Flipping transformation. The basis vectors are flipped according to the rule g₁ ^(k+1)=−u₂ ^(k) and g₂ ^(k+1)=u₁ ^(k). This change of basis is realized by the unimodular matrix:

${T^{k} = \begin{bmatrix} 0 & 1 \\ {- 1} & 0 \end{bmatrix}},$

-   -   The resulting matrix G^(k+1)=U^(k)∘T^(k) is lower triangular and is transformed back to upper triangular form by change of realization A^(k)=Q, where Q is the unitary factor in the QR decomposition of G^(k+1). Alternatively, U^(k+1)=U, the upper triangular multiplier in the Cholesky decomposition (G^(k+1))^(H)G^(k+1)=U^(H)U.

Convergence of the algorithm. The convergence of the reduction algorithm can be proved by energy considerations. We define the energy functional:

E(G)=R(g ₁ ,g ₁)² R(P₁ ^(⊥) g ₂ ,g ₂).

Considering the sequence in (2.47), we have the following theorem:

Theorem 6.1. The following inequality always holds:

E(G ^(k+1))≤E(G ^(k)),

moreover, when G^(k) is flipped we have E(G^(k+1))≤αE (G^(k)) for some α<1.
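The two-dimensional reduction in invariant form can be sketched as an alternation of size reduction and flipping. The sketch assumes that the Hermitian form is supplied as a 2×2 matrix R (the standard product by default) and that the basis vectors are lists of two complex numbers; the names `herm`, `round_gaussian` and `reduce_2d` are hypothetical.

```python
def herm(R, x, y):
    """Hermitian product x^H R y for 2-vectors (first variable skew linear)."""
    return sum(x[i].conjugate() * R[i][j] * y[j]
               for i in range(2) for j in range(2))


def round_gaussian(z):
    """Round a complex number to the closest Gaussian integer."""
    return complex(round(z.real), round(z.imag))


def reduce_2d(g1, g2, R=((1, 0), (0, 1)), max_iter=100):
    """Return (g1', g2', T) with [g1' g2'] = [g1 g2] T and T unimodular."""
    T = [[1, 0], [0, 1]]
    for _ in range(max_iter):
        # Size reduction: g2 <- g2 - a*g1, i.e. T <- T * [[1, -a], [0, 1]].
        a = round_gaussian(herm(R, g1, g2) / herm(R, g1, g1))
        g2 = [g2[i] - a * g1[i] for i in range(2)]
        T = [[T[0][0], T[0][1] - a * T[0][0]],
             [T[1][0], T[1][1] - a * T[1][0]]]
        # Stop once the basis is well ordered.
        if herm(R, g1, g1).real <= herm(R, g2, g2).real:
            break
        # Flip: g1 <- -g2, g2 <- g1, i.e. T <- T * [[0, 1], [-1, 0]].
        g1, g2 = [-v for v in g2], g1
        T = [[-T[0][1], T[0][0]], [-T[1][1], T[1][0]]]
    return g1, g2, T


g1_in, g2_in = [1 + 2j, 1j], [2 + 0j, 1 + 0j]
g1_out, g2_out, T_out = reduce_2d(g1_in, g2_in)
```

In this example the input basis has determinant 1, so it generates the standard Gaussian lattice, and the reduction returns two vectors of unit norm after one flip. A flip is only applied when the second vector is strictly shorter than the first, so R(g₁,g₁) strictly decreases at every flip, consistent with the energy argument.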

2.7 Examples of Lattice Reduction Algorithm in Higher Dimensions

Reduction algorithm: invariant form. The invariant LLL reduction algorithm accepts as input a basis matrix G and produces as output an LLL reduced basis matrix G′ such that G′=G∘T where T∈SL_(N)(ℤ_(i)). Specifically, the algorithm constructs a sequence:

G=G⁰,G¹, . . . G^(K) =G′,   (2.49)

such that G^(k+1)=G^(k)∘T^(k), T^(k)∈SL_(N)(ℤ_(i)). Consequently, we have T=T⁰∘T¹∘ . . . ∘T^(K−1).

The sequence (2.49) is constructed using two types of change of basis transformations.

(1) Size reduction transformation. Let m>n∈[1,N]. Reduce the size of the mth basis vector with respect to the nth basis vector according to the rule g_(m) ^(k+1)=g_(m) ^(k)−ag_(n) ^(k) where:

${a = \left\lbrack \frac{R\left( {{P_{n - 1}^{\bot}g_{n}^{k}},g_{m}^{k}} \right)}{R\left( {{P_{n - 1}^{\bot}g_{n}^{k}},g_{n}^{k}} \right)} \right\rbrack},$

-   -   Here [·] stands for the closest Gaussian integer. This transformation is realized by the unimodular matrix:

${T^{k} = \begin{bmatrix} 1 & . & 0 & 0 & . & 0 \\ . & . & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & . & {- a} & 0 \\ 0 & 0 & 0 & . & . & 0 \\ . & 0 & 0 & 0 & 1 & . \\ 0 & 0 & 0 & 0 & . & 1 \end{bmatrix}},$

-   -   i.e., the unit matrix with additional non-zero entry         T^(k)(n,m)=−a.

(2) Flipping transformation. Let n∈[1, N−1]. Interchange the nth and the n+1th basis vectors according to the rule g_(n) ^(k+1)=−g_(n+1) ^(k) and g_(n+1) ^(k+1)=g_(n) ^(k). This transformation is realized by the unimodular matrix:

${T^{k} = \begin{bmatrix} 1 & . & 0 & 0 & . & 0 \\ . & . & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & {- 1} & 0 & 0 & 0 \\ . & 0 & 0 & 0 & . & . \\ 0 & 0 & 0 & 0 & . & 1 \end{bmatrix}},$

-   -   i.e., the unit matrix except that

${T^{k}\left( {{n:{n + 1}},{n:{n + 1}}} \right)} = {\begin{bmatrix} 0 & 1 \\ {- 1} & 0 \end{bmatrix}.}$

Remark. The algorithm applies the size reduction step only if the size reduction condition is not satisfied and applies the flipping transformation always after applying size reduction transformation and only if the well ordering condition is not satisfied.

Reduction algorithm: upper triangular form. The upper triangular version of the LLL reduction algorithm accepts as input an upper triangular basis matrix U and produces as output an LLL reduced upper triangular basis matrix U′ such that U′=A∘U∘T where T∈SL_(N)(ℤ_(i)) and A∈U(ℂ^(N)). In more detail, the algorithm constructs a sequence of upper triangular matrices:

U⁰=U,U¹, . . . U^(K)=U′,   (2.50)

such that U^(k+1)=A^(k)∘U^(k)∘T^(k) where T^(k)∈SL_(N)(ℤ_(i)) and A^(k)∈U(ℂ^(N)). Consequently, T=T⁰∘T¹∘ . . . ∘T^(K−1) and A=A^(K−1)∘ . . . ∘A¹∘A⁰. In what follows, we use the notation u_(ij) ^(k) for the (i,j) coordinate of the matrix U^(k) and denote by u_(n) ^(k)=U^(k)(:,n) the nth vector of the kth basis.

The sequence (2.50) is constructed using two types of transformations, each being a combination of a change of basis with a change of realization.

(1) Size reduction transformation. For m>n, reduce the size of the mth basis vector u_(m) with respect to the nth basis vector u_(n) according to the rule u_(m) ^(k+1)=u_(m) ^(k)−au_(n) ^(k) where:

$a = {\left\lbrack \frac{\langle{{P_{n - 1}^{\bot}u_{n}^{k}},u_{m}^{k}}\rangle}{\langle{{P_{n - 1}^{\bot}u_{n}^{k}},u_{n}^{k}}\rangle} \right\rbrack = {\left\lbrack \frac{u_{n\; m}^{k}}{u_{nn}^{k}} \right\rbrack.}}$

-   -   This is realized by the unimodular matrix:

${T^{k} = \begin{bmatrix} 1 & . & 0 & 0 & . & 0 \\ . & . & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & . & {- a} & 0 \\ 0 & 0 & 0 & . & . & 0 \\ . & 0 & 0 & 0 & 1 & . \\ 0 & 0 & 0 & 0 & . & 1 \end{bmatrix}},$

-   -   i.e., the unit matrix with additional non-zero entry         T^(k)(n,m)=−a. To conclude:

U^(k+1)=U^(k)∘T^(k),

-   -   Note that the upper triangular size reduction transformation has         the same form as its invariant counterpart since U^(k)∘T^(k) is         already in upper triangular form.

(2) Flipping transformation. For n=1, . . . N−1, interchange the nth and the n+1th basis vectors according to the rule g_(n) ^(k+1)=−u_(n+1) ^(k) and g_(n+1) ^(k+1)=u_(n) ^(k), followed by a change of realization to return to upper triangular form. The change of basis matrix is given by:

${T^{k} = \begin{bmatrix} 1 & . & 0 & 0 & . & 0 \\ . & . & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & {- 1} & 0 & 0 & 0 \\ . & 0 & 0 & 0 & . & . \\ 0 & 0 & 0 & 0 & . & 1 \end{bmatrix}},$

-   -   The change of realization matrix is defined as follows. Let G^(k+1)=U^(k)∘T^(k). One can verify that G^(k+1) is almost upper triangular except that L=G^(k+1)(n:n+1,n:n+1) is lower triangular. Let L=QR be the QR decomposition of L where Q is unitary and R upper triangular. We define:

$A^{k} = {\begin{bmatrix} 1 & . & 0 & 0 & . & 0 \\ . & . & 0 & 0 & 0 & 0 \\ 0 & 0 & q_{11} & q_{12} & 0 & 0 \\ 0 & 0 & q_{21} & q_{22} & 0 & 0 \\ . & 0 & 0 & 0 & . & . \\ 0 & 0 & 0 & 0 & . & 1 \end{bmatrix}.}$

-   -   i.e., A^(k) is the unit matrix except that A^(k)(n:n+1,n:n+1)=Q. To conclude: U^(k+1)=A^(k)∘U^(k)∘T^(k). Alternatively, we can define U^(k+1)=G^(k+1) except that U^(k+1)(n:n+1,n:n+1)=U where U is the upper triangular factor in the Cholesky factorization L^(H)L=U^(H)U.
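One flipping step in the upper triangular form can be sketched as a signed column swap followed by a 2×2 unitary rotation acting on rows n and n+1. The sketch uses 0-based indices and builds the rotation directly so that it annihilates the subdiagonal entry; how this rotation is expressed in terms of the Q factor depends on the QR convention in use, and the name `flip_upper_triangular` is hypothetical.

```python
import math


def flip_upper_triangular(U, n):
    """Flip columns n and n+1 (0-based) of U and restore upper triangular form.

    Returns (U_next, q) where q is the 2x2 unitary applied to rows n, n+1.
    """
    size = len(U)
    # Change of basis: column n <- -column n+1, column n+1 <- column n.
    G = [row[:] for row in U]
    for i in range(size):
        G[i][n], G[i][n + 1] = -U[i][n + 1], U[i][n]
    # Rotation that maps (x, y) to (r, 0) with r = sqrt(|x|^2 + |y|^2).
    x, y = complex(G[n][n]), complex(G[n + 1][n])
    r = math.sqrt(abs(x) ** 2 + abs(y) ** 2)
    q = [[x.conjugate() / r, y.conjugate() / r], [-y / r, x / r]]
    out = [row[:] for row in G]
    for j in range(size):
        top, bot = G[n][j], G[n + 1][j]
        out[n][j] = q[0][0] * top + q[0][1] * bot
        out[n + 1][j] = q[1][0] * top + q[1][1] * bot
    out[n + 1][n] = 0j  # zero by construction; clears rounding noise
    return out, q


# This basis is not well ordered: |u_11|^2 = 4 exceeds |u_12|^2 + |u_22|^2 = 3.
U_before = [[2, 1 + 1j], [0, 1]]
U_after, q_demo = flip_upper_triangular(U_before, 0)
```

After the flip the diagonal becomes (√3, 2/√3): the first diagonal entry has shrunk while the product of the diagonal entries, and hence the lattice volume, is unchanged.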

3 Exemplary Implementations for Lattice Detection/Reduction

Typical constellations like QAM vectors can be naturally viewed as truncations of translated lattices in ℂ^(N). This observation suggests the idea of relaxing the problem of finding the closest constellation point (the CCP problem) to the problem of finding the closest corresponding lattice point (the CLP problem), ignoring the truncation boundary conditions. One of the advantages of working with lattices over working with constellation sets is that lattices admit various bases and realizations. This freedom makes it possible to devise lower complexity detection algorithms. Two examples of such algorithms are the heuristic Babai detector and the exact sphere detector. Both algorithms use a canonical weighted tree structure which is associated with the lattice.

3.1 Preliminaries from Linear Algebra Euclidean Geometry

Hermitian vector spaces. Euclidean geometry over the complex numbers studies Hermitian vector spaces.

Definition. An Hermitian vector space is a pair (V,R) where V is a vector space over ℂ and R:V×V→ℂ is an Hermitian product.

We follow the convention that the first variable is skew linear, that is:

$R(\lambda v,\lambda' v') = \overline{\lambda}\,\lambda' R(v,v'),$

for every v,v′∈V and λ,λ′∈ℂ. Hermitian products transform under linear maps. Given an Hermitian vector space (V,R) and a linear map A:U→V, one defines the pullback Hermitian product A*(R):U×U→ℂ by the rule:

A*(R)(u,u′)=R(Au,Au′),

for every u,u′∈U. We note the relation with standard matrix definitions. When V=ℂ^(N) the form R can be represented by an Hermitian matrix R so that:

R(x,y)=x ^(H) Ry,

for every x,y∈ℂ^(N) where x^(H) denotes the row vector obtained as the complex conjugate of a column vector x. Moreover, if in addition U=ℂ^(M) and A is an N×M matrix then:

A*(R)=A ^(H) RA,

where A^(H) stands for the conjugate transpose of A. To conclude, one should view Hermitian vector spaces as the coordinate free counterpart of Hermitian matrices. In the sequel we use the notation:

R(v)=R(v,v),

for every v∈V. We also use the notation R₀:ℂ^(N)×ℂ^(N)→ℂ for the standard Hermitian product:

R ₀(x,y)=x ^(H) y,

for every x,y∈ℂ^(N).

Duality. Hermitian vector spaces admit duals. Given an Hermitian vector space (V,R) there exists a canonical Hermitian form on the dual vector space V*. To define it, let {circumflex over (R)}:V→V* denote the canonical skew linear map sending a vector v∈V to the linear functional {circumflex over (R)}(v)∈V* defined by:

{circumflex over (R)}(v)(v′)=R(v,v′),

for every v′∈V. The map {circumflex over (R)} is a bijection since R is non-degenerate. The dual Hermitian product R^(d):V*×V*→ℂ is characterized by the pullback equation:

{circumflex over (R)}*(R ^(d))=R⇔R ^(d) ={circumflex over (R)} ⁻¹*(R)   (3.1)

Note that R^(d) is skew linear in the second variable. Finally, we note the relation with standard matrix definitions. Assume V=ℂ^(N). Let us denote by R₀ the standard inner product on ℂ^(N). Assume that the Hermitian product R is represented by an Hermitian matrix:

R(x,y)=x ^(H) Ry,

for every x,y∈ℂ^(N). Under these assumptions one can show that:

{circumflex over (R)}* ₀(R ^(d))(x,y)=x ^(H) R ⁻¹ y,   (3.2)

for every x,y∈ℂ^(N). In words, the pullback of R^(d) under the map {circumflex over (R)}₀ is represented by the inverse of the matrix R.

Remark. There is a nice interpretation of R and R^(d) in the framework of probability theory. Under this interpretation, the form R is the information matrix of a Gaussian random variable X∈V such that:

${{\mathbb{P}}\left( {X = x} \right)} \propto {e^{{- \frac{1}{2}}{R{({x,x})}}}.}$

The dual form R^(d) is the covariance matrix of X. One can show that:

${{R^{d}\left( {\alpha,\beta} \right)} = {\int\limits_{x \in V}{{\alpha (x)}\overset{\_}{\beta (x)}{{\mathbb{P}}\left( {X = x} \right)}{dx}}}},$

for every pair of linear functionals α,β∈V*. In words, R^(d)(α,β) is the covariance between any pair of scalar measurements of the random variable X. Note that in standard texts both R and R^(d) are given by Hermitian matrices satisfying R^(d)=R⁻¹. This convention is not canonical and relies on the existence of another Hermitian form R₀ on V used to identify V and V*.

Orthogonal projection. Probably the most important notion in Euclidean geometry is that of orthogonal projection. Let (V,R) be a finite dimensional Hermitian vector space. Let U⊂V be a linear subspace. The orthogonal complement of U is denoted by U^(⊥) and is defined as:

U ^(⊥) ={v∈V:R(v,u)=0 for every u∈U}.

Every vector v∈V can be written in a unique manner as a sum:

v=P _(U)(v)+P _(U) _(⊥) (v),

where P_(U)(v)∈U and P_(U⊥)(v)∈U^(⊥). The rule v↦P_(U)(v) gives a linear map P_(U):V→V called the orthogonal projection on U. Similarly, the rule v↦P_(U) _(⊥) (v) gives a linear map P_(U) _(⊥) :V→V called the orthogonal projection on U^(⊥). One can verify the following identities:

P _(U) +P _(U) _(⊥) =Id

P_(U)∘P_(U)=P_(U),

P_(U) _(⊥) ∘P_(U) _(⊥) =P_(U) _(⊥) ,

P_(U) _(⊥) ∘P_(U)=P_(U)∘P_(U) _(⊥) =0.

In words, P_(U) and P_(U) _(⊥) form an orthogonal decomposition of the identity. Interestingly, P_(U) can be characterized by the geometric property:

${P_{U}(v)} = {\arg \; {\min\limits_{u}{\left\{ {{R\left( {v - u} \right)}:{u \in U}} \right\}.}}}$

for every v∈V. In words, P_(U)(v) is the vector in U that is the closest to v with respect to the Euclidean metric R.

Remark. The notion of orthogonality, and hence the definition of orthogonal projection, depends very strongly on the Hermitian structure R. Note that even when V=ℂ^(N) and U={(x₁, . . . x_(K),0, . . . 0):x_(i)∈ℂ} is the standard coordinate subspace, the orthogonal complement U^(⊥) might look very different from {(0, . . . 0,x_(K+1), . . . x_(N)):x_(i)∈ℂ} unless the Hermitian product is the standard product:

R ₀(x,y)=x ^(H) y,

for every x,y∈ℂ^(N).

Schur reduction. Let (V,R) be a finite dimensional Hermitian vector space. Let U⊂V be a linear subspace. We define a degenerate Hermitian product R_(U):V×V→ℂ by:

R _(U)(v,v′)=R(P _(U) v,v′).   (3.3)

The Hermitian product R_(U) is called the Schur reduction of R to the subspace U. One can verify that R_(U)(u,v)=0 for every u∈U^(⊥) and v∈V thus R_(U) reduces to a non-degenerate Hermitian product on the quotient vector space V/U^(⊥). Similarly, the Schur reduction with respect to U^(⊥) is given by:

$\begin{matrix} {{R_{U^{\bot}}\left( {v,v^{\prime}} \right)} = {R\left( {{P_{U^{\bot}}v},v^{\prime}} \right)}} \\ {= {{R\left( {v,v^{\prime}} \right)} - {{R_{U}\left( {v,v^{\prime}} \right)}.}}} \end{matrix}$

The Hermitian product R_(U) _(⊥) is sometimes called the Schur complement of R with respect to U; it is a degenerate Hermitian form that reduces to a non-degenerate Hermitian form on the quotient space V/U.

We note the relation to standard matrix formulation. Assume V=ℂ^(N) and U={(x₁, . . . x_(K),0, . . . 0):x_(i)∈ℂ}. Assume R is represented by an N×N Hermitian matrix R of the form:

${R = \begin{bmatrix} A & B \\ B^{H} & D \end{bmatrix}},$

where A, B and D are matrices of dimensions K×K, K×(N−K) and (N−K)×(N−K) respectively. We assume A is invertible. The Schur reductions R_(U) and R_(U) _(⊥) are represented respectively by the (singular) Hermitian matrices:

$\begin{matrix} {{R_{U} = \begin{bmatrix} A & B \\ B^{H} & {B^{H}A^{- 1}B} \end{bmatrix}},{R_{U^{\bot}} = {\begin{bmatrix} 0 & 0 \\ 0 & {D - {B^{H}A^{- 1}B}} \end{bmatrix}.}}} & (3.4) \end{matrix}$
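The block formulas in (3.4) can be checked on a small example. The sketch below restricts to the case K=1, where A is the scalar R[0][0] and the reduction R_(U) is a rank-one matrix; the function name is hypothetical.

```python
def schur_reduction_first_coordinate(R):
    """Schur reduction (3.4) for U spanned by the first coordinate (K = 1).

    R: Hermitian matrix as a list of rows with R[0][0] != 0.
    Returns (R_U, R_U_perp) with R_U + R_U_perp = R.
    """
    n = len(R)
    # [A; B^H] A^{-1} [A B] reduces to R[i][0] * R[0][j] / R[0][0] when K = 1.
    R_U = [[R[i][0] * R[0][j] / R[0][0] for j in range(n)] for i in range(n)]
    R_perp = [[R[i][j] - R_U[i][j] for j in range(n)] for i in range(n)]
    return R_U, R_perp


R_demo = [[2, 1j], [-1j, 3]]
R_U_demo, R_perp_demo = schur_reduction_first_coordinate(R_demo)
```

Here B^(H)A⁻¹B=1/2, so the lower right entry of R_(U) is 0.5 and the Schur complement D−B^(H)A⁻¹B equals 2.5, with all other entries of R_(U) _(⊥) equal to zero.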

Based Hermitian lattices. Let ℤ_(i)=ℤ⊕iℤ be the ring of Gaussian integers where i=√(−1). The main objects of study in these notes are full rank lattices lying inside Hermitian vector spaces. Such objects lie at the intersection of lattice theory and Euclidean geometry. In fact, we will require in addition a choice of a particular basis of the lattice, giving rise to the notion of a based Hermitian lattice.

Definition. A based Hermitian lattice over ℤ_(i) is a triple (V,G,R) where:

-   -   V is a complex vector space of dimension N.

-   -   G is a linear isomorphism G:ℂ^(N)→V.

-   -   R is an Hermitian product R:V×V→ℂ.

A based Hermitian lattice (V,G,R) defines in particular a lattice Λ=G (ℤ_(i) ^(N)). Elements of Λ are ℤ_(i) linear combinations of the vectors λ_(i)=G (e_(i)) where e_(i) is the standard ith basis vector of ℂ^(N), that is:

Λ={a[1]λ₁ + . . . +a[N]λ_(N) :a[k]∈ℤ_(i)}.

The vectors λ₁, . . . λ_(N)∈V form a basis of Λ. The map G is called the generator of Λ. The Hermitian vector space (V,R) is called the realization space of Λ.

Remark. A based Hermitian lattice is a full rank lattice in an Hermitian vector space equipped with a specific choice of basis.

Two basic operations on based Hermitian lattices include:

1. Change of basis. Changing the basis of a based Hermitian lattice amounts to changing the generator matrix G. Let SL_(N)(ℤ_(i)) denote the group of Gaussian unimodular N×N matrices, that is, the set consisting of N×N matrices of determinant 1 with coefficients in the ring ℤ_(i) equipped with the operation of matrix multiplication (it can be shown that the inverse of a unimodular matrix is also unimodular).

Definition. A change of basis of a based Hermitian lattice (V,G,R) is a based Hermitian lattice of the form (V,G′,R) where G′=G∘T for some T∈SL_(N)(ℤ_(i)).

2. Change of realization. Changing the realization of a based Hermitian lattice amounts to changing the realization space (V,R).

Definition. A change of realization of a based Hermitian lattice (V,G,R) is a based Hermitian lattice (V′,G′,R′) where G′=A∘G and R=A*(R′) for some isomorphism A:V→V′.

Note that if V=V′=ℂ^(N) then the forms R and R′ can be represented by Hermitian matrices and the pullback equation R=A*(R′) translates to the conjugation condition:

R=A^(H)R′A.

Triangular realizations. Let (V,G,R) be a based Hermitian lattice. There exist two special realizations. The first is an upper triangular realization (ℂ^(N),U,R₀) where U is an upper triangular matrix. The generator U is defined by the upper Cholesky decomposition:

G*(R)=U ^(H) U.

Similarly, there exists a lower triangular realization (ℂ^(N),L,R₀) where L is a lower triangular matrix. The generator L is defined by the lower Cholesky decomposition:

G*(R)=L ^(H) L.

We refer to an upper triangular realization of an Hermitian lattice as upper triangular Hermitian lattice, and a lower triangular realization of an Hermitian lattice is referred to as a lower triangular Hermitian lattice.
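The upper triangular realization can be computed with a small sketch. It assumes V=ℂ^(N), with the generator G and the form R given as nested lists, and computes U from the pullback G*(R)=G^(H)RG by a textbook Cholesky recursion; the function names `pullback` and `upper_cholesky` are hypothetical.

```python
import math


def pullback(G, R):
    """Matrix of the pullback form G*(R) = G^H R G."""
    n = len(G)
    return [[sum(G[k][i].conjugate() * R[k][l] * G[l][j]
                 for k in range(n) for l in range(n))
             for j in range(n)] for i in range(n)]


def upper_cholesky(M):
    """Upper triangular U with positive diagonal such that U^H U = M."""
    n = len(M)
    U = [[0j] * n for _ in range(n)]
    for i in range(n):
        diag = M[i][i].real - sum(abs(U[k][i]) ** 2 for k in range(i))
        U[i][i] = complex(math.sqrt(diag))
        for j in range(i + 1, n):
            acc = sum(U[k][i].conjugate() * U[k][j] for k in range(i))
            U[i][j] = (M[i][j] - acc) / U[i][i]
    return U


G_demo = [[1, 1j], [0, 1]]
R_demo = [[2, 0], [0, 1]]
M_demo = pullback(G_demo, R_demo)   # Gram matrix of the basis under R
U_demo = upper_cholesky(M_demo)
```

For this example the Gram matrix is [[2, 2i], [−2i, 3]] and the resulting generator is U=[[√2, √2·i], [0, 1]], which represents the same based Hermitian lattice with respect to the standard product R₀.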

Standard filtrations. Let (V,G,R) be a based Hermitian lattice with basis λ_(i)=G(e_(i)),i=1, . . . N. We define two ascending filtrations of V. The lower filtration:

0=V₀⊂V₁⊂ . . . ⊂V_(N)=V,

where V_(n) is the subspace of V spanned by the basis vectors λ₁, . . . λ_(n). The upper filtration:

0=V⁰⊂V¹⊂ . . . ⊂V^(N)=V,

where V^(n) is the subspace of V spanned by the basis vectors λ_(N−n+1), . . . λ_(N). In what follows, we use the notations [R]_(n) and [R]^(n) for the Schur complements of R with respect to V_(n) and V^(n) respectively.

Duality. Based Hermitian lattices admit duals. Let (V,G,R) be a based Hermitian lattice with basis λ_(i)=G(e_(i)),i=1, . . . N. Let λ*_(i)∈V*,i=1, . . . N denote the dual basis. Recall that λ*_(i) is defined by:

${\lambda_{i}^{*}\left( \lambda_{j} \right)} = \left\{ {\begin{matrix} 1 & {j = i} \\ 0 & {otherwise} \end{matrix},} \right.$

for every j∈[1,N]. The dual based Hermitian lattice is the triple (V*,G^(d),R^(d)) where the dual generator is defined by:

G ^(d)(e _(i))=λ*_(i),

for every i∈[1,N]. Note that the lattice Λ*=G^(d)(ℤ_(i) ^(N)) is dual to Λ in the conventional sense, namely, Λ*={α∈V*:α(λ)∈ℤ_(i) for every λ∈Λ}.

VBLAST reduced basis. Let (V,G,R) be a based Hermitian lattice. Let λ₁, . . . λ_(N)∈V be the corresponding basis where λ_(i)=G(e_(i)). The VBLAST reduction condition is expressed in terms of the dual lattice.

Definition. We say that λ₁, . . . λ_(N) is a VBLAST reduced basis of Λ if it satisfies the following condition:

${N - n} = {\arg \; {\min\limits_{k}{\left\{ {{\left\lbrack R^{d} \right\rbrack^{n}\left( {\lambda_{k}^{*},\lambda_{k}^{*}} \right)}:{k \in \left\lbrack {1,{N - n}} \right\rbrack}} \right\}.}}}$

for n∈[0,N−1].

In words, λ₁, . . . λ_(N) is VBLAST reduced if the basis vectors are well ordered so that λ*_(N−n) has the minimum dual norm taken mod λ*_(N−n+1), . . . λ*_(N).

Remark. There is a nice interpretation of the VBLAST well ordering condition in the framework of the Wiener theory of decision feedback equalization. Under this interpretation, the last among the first N−n symbols has the maximum SNR after subtracting the interference from symbols N−n+1, . . . N. This implies that the VBLAST ordering is optimal with respect to the error propagation effect.

VBLAST reduction algorithm. The VBLAST reduction algorithm accepts as input a based Hermitian lattice (V,G,R) and produces as output a based Hermitian lattice (V,G′,R) such that:

(1) G′=G∘T where T∈SL_(N)(ℤ_(i)) is a permutation matrix.

(2) λ′_(i)=G′(e_(i)),i=1, . . . N is a VBLAST reduced basis of Λ=G(ℤ_(i) ^(N)).

The VBLAST change of basis permutation π:[1,N]→[1,N] is defined according to the following recursive formula. At step n∈[0,N−1]:

-   -   Let V*_(π) ^(n)⊂V* denote the subspace spanned by λ*_(π(N−n+1)), . . . λ*_(π(N)).
-   -   Let [R^(d)]_(π) ^(n) denote the Schur complement of R^(d) with respect to V*_(π) ^(n).

Define the value π(N−n) by:

${\pi \left( {N - n} \right)} = {\arg \; {\min\limits_{k}{\left\{ {{\left\lbrack R^{d} \right\rbrack_{\pi}^{n}\left( {\lambda_{k}^{*},\lambda_{k}^{*}} \right)}:{k \notin {\pi \left( \left\lbrack {{N - n + 1},N} \right\rbrack \right)}}} \right\}.}}}$

Consequently, the change of basis transformation is the matrix T obtained by permuting the columns of the identity according to π.

LLL (Lenstra-Lenstra-Lovász) reduced basis. One can view the LLL reduction condition as a kind of generalization of the VBLAST reduction condition explained in the previous subsection. Let (V,G,R) be a based Hermitian lattice. Let λ₁, . . . λ_(N)∈V be the corresponding basis where λ_(i)=G(e_(i)).

Definition. We say that λ₁, . . . λ_(N) is an LLL reduced basis of Λ if it satisfies the following two conditions:

(1) Size reduction condition:

|Re[R]_(i−1)(λ_(i),λ_(j))|≤½[R]_(i−1)(λ_(i),λ_(i)),

|Im[R]_(i−1)(λ_(i),λ_(j))|≤½[R]_(i−1)(λ_(i),λ_(i)),

-   -   for every i=1, . . . N−1 and j>i.

(2) Well ordering condition:

[R]_(i−1)(λ_(i),λ_(i))≤[R]_(i−1)(λ_(i+1),λ_(i+1)),

-   -   for every i=1, . . . N−1.

Note that the size reduction condition implies the inequality:

${\left| {\lbrack R\rbrack_{i - 1}\left( {\lambda_{i},\lambda_{j}} \right)} \right| \leq {{\frac{1}{\sqrt{2}}\lbrack R\rbrack}_{i - 1}\left( {\lambda_{i},\lambda_{i}} \right)}},$

for every i=1, . . . N−1 and j>i. When λ₁, . . . λ_(N) is an LLL reduced basis we say that the generator matrix G is LLL reduced.

Reduction algorithm. We describe two versions of the LLL reduction algorithm. The first version is independent of the specific realization of the lattice and the second version is defined in terms of upper triangular realizations.

Invariant form. The invariant LLL reduction algorithm accepts as input a based Hermitian lattice (V,G,R) and produces as output a (change of basis) based Hermitian lattice (V,G′,R) such that:

(1) G′=G∘T where T∈SL_(N)(ℤ_(i)).

(2) λ′_(i)=G′(e_(i)),i=1, . . . N is an LLL reduced basis of Λ=G′(ℤ_(i) ^(N))=G(ℤ_(i) ^(N)).

In more detail, the algorithm constructs a sequence of generator matrices:

G=G⁰,G¹, . . . G^(K)=G′,   (3.5)

such that G^(k+1)=G^(k)∘T^(k),T^(k)∈SL_(N)(ℤ_(i)). Let T=T⁰∘T¹∘ . . . ∘T^(K−1). The LLL reduced generator is G^(K)=G∘T.

The sequence (3.5) is constructed using two types of change of basis transformations.

Size reduction transformation. Let j>i∈[1,N]. Reduce the size of the jth basis vector with respect to the ith basis vector according to the rule λ_(j) ^(k+1)=λ_(j) ^(k)−aλ_(i) ^(k) where:

$a = {\left\lbrack \frac{\lbrack R\rbrack_{i - 1}\left( {\lambda_{i}^{k},\lambda_{j}^{k}} \right)}{\lbrack R\rbrack_{i - 1}\left( {\lambda_{i}^{k},\lambda_{i}^{k}} \right)} \right\rbrack.}$

Here [-] stands for the closest Gaussian integer. This transformation is realized by the unimodular matrix:

${T^{k} = \begin{bmatrix} 1 & . & 0 & 0 & . & 0 \\ . & . & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & . & {- a} & 0 \\ 0 & 0 & 0 & . & . & 0 \\ . & 0 & 0 & 0 & 1 & . \\ 0 & 0 & 0 & 0 & . & 1 \end{bmatrix}},$

i.e., the unit matrix with additional non-zero entry T^(k)(i,j)=−a.
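As an illustration of the rounding step, the following minimal Python sketch computes the closest Gaussian integer and the size reduction coefficient a. The function names `round_gaussian` and `size_reduction_coefficient` are hypothetical, and the sketch assumes that the two values [R]_(i−1)(λ_(i),λ_(j)) and [R]_(i−1)(λ_(i),λ_(i)) have already been computed and that R is linear in its second argument.

```python
def round_gaussian(z: complex) -> complex:
    """Closest Gaussian integer: round real and imaginary parts separately."""
    return complex(round(z.real), round(z.imag))


def size_reduction_coefficient(r_ij: complex, r_ii: float) -> complex:
    """Coefficient a = [r_ij / r_ii] used in lambda_j <- lambda_j - a * lambda_i.

    r_ij stands for [R]_(i-1)(lambda_i, lambda_j) and r_ii for
    [R]_(i-1)(lambda_i, lambda_i); both are assumed to be given.
    """
    return round_gaussian(r_ij / r_ii)


# Illustrative values only, not taken from the document.
r_ij, r_ii = 2.6 - 1.4j, 1.0
a = size_reduction_coefficient(r_ij, r_ii)
# Value of [R]_(i-1)(lambda_i, lambda_j) after the update of lambda_j.
residual = r_ij - a * r_ii
```

After the update, the real and imaginary parts of `residual` are each at most r_ii/2 in absolute value, which is the size reduction condition.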

Flipping transformation. Let i∈[1,N−1]. Interchange the ith and the i+1th basis vectors according to the rule λ_(i) ^(k+1)=−λ_(i+1) ^(k) and λ_(i+1) ^(k+1)=λ_(i) ^(k). This transformation is realized by the unimodular matrix:

$T^{k} = \begin{bmatrix} 1 & . & 0 & 0 & . & 0 \\ . & . & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & {- 1} & 0 & 0 & 0 \\ . & 0 & 0 & 0 & . & . \\ 0 & 0 & 0 & 0 & . & 1 \end{bmatrix}$

i.e., the unit matrix except that

${T^{k}\left( {{i:{i + 1}},{i:{i + 1}}} \right)} = {\begin{bmatrix} 0 & 1 \\ {- 1} & 0 \end{bmatrix}.}$

Remark. The algorithm applies the size reduction step only if the size reduction condition is not satisfied. The flipping transformation is applied only after a size reduction transformation, and if and only if the well ordering condition is not satisfied.

Upper triangular form. The upper triangular version of the LLL reduction algorithm accepts as input an upper triangular based Hermitian lattice (ℂ^(N),U,R₀) and produces as output an upper triangular based Hermitian lattice (ℂ^(N),U′,R₀) such that:

(1) U′=A∘U∘T where T∈SL_(N)(ℤ_(i)) and A∈U_(N).

(2) λ′_(i)=U′(e_(i)),i=1, . . . N is an LLL reduced basis of Λ′=U′(ℤ_(i) ^(N)).

In some embodiments, the algorithm constructs a sequence of upper triangular matrices:

U⁰=U,U¹, . . . U^(K)=U′,   (3.6)

such that U^(k+1)=A^(k)∘U^(k)∘T^(k) where T^(k)∈SL_(N)(ℤ_(i)) and A^(k)∈U_(N). Let T=T⁰∘T¹∘ . . . ∘T^(K−1) and let A=A^(K−1)∘ . . . ∘A¹∘A⁰. The reduced upper triangular realization is given by U^(K)=A∘U⁰∘T. In what follows we use the notation:

$U^{k} = {\begin{bmatrix} U_{11}^{k} & . & . & . & . & U_{1N}^{k} \\ 0 & U_{22}^{k} & . & . & . & . \\ 0 & 0 & . & . & . & . \\ . & . & . & . & . & . \\ . & . & . & 0 & . & . \\ 0 & . & . & 0 & . & U_{NN}^{k} \end{bmatrix}.}$

The sequence (3.6) is constructed using two types of transformations, each being a combination of a change of basis with a change of realization.

Size reduction transformation. Let j>i∈[1,N]. Reduce the size of the jth basis vector with respect to the ith basis vector according to the rule λ_(j) ^(k+1)=λ_(j) ^(k)−aλ_(i) ^(k) where:

$a = {\left\lbrack \frac{\lbrack R\rbrack_{i - 1}\left( {\lambda_{i}^{k},\lambda_{j}^{k}} \right)}{\lbrack R\rbrack_{i - 1}\left( {\lambda_{i}^{k},\lambda_{i}^{k}} \right)} \right\rbrack = {\left\lbrack \frac{\overset{\_}{U_{ii}^{k}} \cdot U_{ij}^{k}}{\left| U_{ii}^{k} \right|^{2}} \right\rbrack.}}$

This is realized by the unimodular matrix:

$T^{k} = {\begin{bmatrix} 1 & . & 0 & 0 & . & 0 \\ . & . & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & . & {- a} & 0 \\ 0 & 0 & 0 & . & . & 0 \\ . & 0 & 0 & 0 & 1 & . \\ 0 & 0 & 0 & 0 & . & 1 \end{bmatrix}.}$

i.e., the unit matrix with additional non-zero entry T^(k)(i,j)=−a. To conclude:

U^(k+1)=U^(k)∘T^(k).

Note that the upper triangular size reduction transformation has the same form as its invariant counterpart since U^(k)∘T^(k) is already in upper triangular form.

Flipping transformation. Let i∈[1,N−1]. Interchange between the ith and the i+1th basis vectors according to the rule λ_(i) ^(k+1)=−λ_(i+1) ^(k) and λ_(i+1) ^(k+1)=λ_(i) ^(k) followed by a change of realization to return to upper triangular form. The change of basis matrix is given by:

$T^{k} = {\begin{bmatrix} 1 & . & 0 & 0 & . & 0 \\ . & . & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & {- 1} & 0 & 0 & 0 \\ . & 0 & 0 & 0 & . & . \\ 0 & 0 & 0 & 0 & . & 1 \end{bmatrix}.}$

The change of realization matrix is defined as follows. Let G^(k+1)=U^(k)∘T^(k). One can verify that G^(k+1) is almost upper triangular except that L=G^(k+1)(i:i+1,i:i+1) is lower triangular. Let L=QR be the QR decomposition of L where Q is unitary and R is upper triangular. We define:

$A^{k} = {\begin{bmatrix} 1 & . & 0 & 0 & . & 0 \\ . & . & 0 & 0 & 0 & 0 \\ 0 & 0 & Q_{11} & Q_{12} & 0 & 0 \\ 0 & 0 & Q_{21} & Q_{22} & 0 & 0 \\ . & 0 & 0 & 0 & . & . \\ 0 & 0 & 0 & 0 & . & 1 \end{bmatrix}.}$

i.e., A^(k) is the unit matrix except that A^(k)(i:i+1,i:i+1)=Q. To conclude:

U^(k+1)=A^(k)∘U^(k)∘T^(k).

Alternatively, we can define U^(k+1)=G^(k+1) except that U^(k+1)(i:i+1,i:i+1)=U where U is the upper triangular factor in the Cholesky factorization L*L=U*U.

The LLL reduction theory in dimension two. For pedagogical reasons it is worthwhile to explain the general theory in the special case when dim V=2. The set-up includes a based Hermitian lattice (V,G,R) where dim V=2. The vectors λ₁=G(e₁) and λ₂=G(e₂) form a basis of the corresponding rank 2 lattice Λ=G(ℤ_(i) ²).

Definition. We say that λ₁,λ₂ is an LLL reduced basis of Λ if it satisfies the following two conditions:

(1) Size reduction condition:

|ReR(λ₁,λ₂)|≤½R(λ₁,λ₁),

|ImR(λ₁,λ₂)|≤½R(λ₁,λ₁).

(2) Well ordering condition:

R(λ₁,λ₁)≤R(λ₂,λ₂).

Note that the size reduction condition implies the inequality:

${{R\left( {\lambda_{1},\lambda_{2}} \right)}} \leq {\frac{1}{\sqrt{2}}{{R\left( {\lambda_{1},\lambda_{1}} \right)}.}}$

The nice thing about LLL reduced bases is that they consist of short vectors which are nearly orthogonal to one another. The quantitative meaning of this statement is the content of the following theorem.

Theorem (Reduction theorem). If λ₁,λ₂ is an LLL reduced basis then:

(1) The vector λ₁ satisfies:

${\sqrt{R\left( {\lambda_{1},\lambda_{1}} \right)} \leq {c\sqrt{R\left( {\lambda_{short},\lambda_{short}} \right)}}},$

-   -   where $c = {1/\sqrt{2 - \sqrt{2}}}$ and λ_(short) denotes the shortest non-zero vector in Λ.

(2) The vector λ₂ satisfies:

${\sqrt{R_{V_{1}^{\bot}}\left( {\lambda_{2},\lambda_{2}} \right)} \leq \sqrt{R_{V_{1}^{\bot}}\left( {\lambda,\lambda} \right)}},$

-   -   for every vector λ∈Λ with non-zero orthogonal projection on V₁ ^(⊥).

In words, the Theorem asserts that the first basis vector λ₁ is no longer than a scalar multiple of the shortest non-zero vector in the lattice where the scalar is universal (does not depend on the lattice) and that the second basis vector is the shortest non-zero vector mod λ₁.

Reduction algorithm, invariant form. The algorithm accepts as input a based Hermitian lattice (V,G,R) and produces as output a (change of basis) based Hermitian lattice (V,G′,R) such that:

(1) G′=G∘T, where T∈SL₂(ℤ_(i)).

(2) λ′_(i)=G′(e_(i)),i=1,2 is an LLL reduced basis of Λ=G(ℤ_(i) ²).

In some embodiments, the algorithm constructs a sequence of generator matrices:

G=G⁰,G¹, . . . G^(K)=G′.   (3.7)

Herein, G^(k+1)=G^(k)∘T^(k) where T^(k)∈SL₂(ℤ_(i)). The sequence (3.7) is constructed by alternating between the two following basic transformations.

Size reduction transformation. Reduce the size of the second basis vector with respect to the first basis vector according to the rule λ₂ ^(k+1)=λ₂ ^(k)−aλ₁ ^(k) where:

$a = {\left\lbrack \frac{R\left( {\lambda_{1}^{k},\lambda_{2}^{k}} \right)}{R\left( {\lambda_{1}^{k},\lambda_{1}^{k}} \right)} \right\rbrack.}$

This transformation is realized by the unimodular matrix:

$T^{k} = {\begin{bmatrix} 1 & {- a} \\ 0 & 1 \end{bmatrix}.}$

Flipping transformation. Interchange the basis vectors according to the rule λ₁ ^(k+1)=−λ₂ ^(k) and λ₂ ^(k+1)=λ₁ ^(k). This transformation is realized by the unimodular matrix:

$T^{k} = {\begin{bmatrix} 0 & 1 \\ {- 1} & 0 \end{bmatrix}.}$
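The alternation between the two transformations can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the lattice is realized in ℂ^n with the standard Hermitian product playing the role of R, and the names `herm`, `round_gaussian` and `lll_reduce_2d` are hypothetical. The accumulated matrix T expresses the output basis in terms of the input basis.

```python
def herm(u, v):
    """Standard Hermitian product, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def round_gaussian(z):
    """Closest Gaussian integer."""
    return complex(round(z.real), round(z.imag))


def lll_reduce_2d(lam1, lam2, max_iter=1000):
    """Return (lam1, lam2, T) where the new basis equals the input basis times T."""
    T = [[1, 0], [0, 1]]
    for _ in range(max_iter):
        # Size reduction: lam2 <- lam2 - a * lam1, i.e. T <- T o [[1, -a], [0, 1]].
        a = round_gaussian(herm(lam1, lam2) / herm(lam1, lam1).real)
        if a != 0:
            lam2 = [y - a * x for x, y in zip(lam1, lam2)]
            T = [[T[0][0], T[0][1] - a * T[0][0]],
                 [T[1][0], T[1][1] - a * T[1][0]]]
        # Well ordering condition: stop once R(lam1, lam1) <= R(lam2, lam2).
        if herm(lam1, lam1).real <= herm(lam2, lam2).real:
            return lam1, lam2, T
        # Flip: lam1 <- -lam2, lam2 <- lam1, i.e. T <- T o [[0, 1], [-1, 0]].
        lam1, lam2 = [-y for y in lam2], lam1
        T = [[-T[0][1], T[0][0]],
             [-T[1][1], T[1][0]]]
    raise RuntimeError("reduction did not terminate within max_iter steps")


# Illustrative input basis (not taken from the document).
l1, l2, T = lll_reduce_2d([5 + 0j, 0j], [3 + 1j, 1 + 0j])
```

Each flip strictly decreases the squared length of the first basis vector, which is why the loop terminates for a genuine lattice basis; the `max_iter` guard is only a safety net against floating point edge cases.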

Reduction algorithm, upper triangular form. The algorithm accepts as input an upper triangular based Hermitian lattice (ℂ²,U,R₀) and produces as output an upper triangular based Hermitian lattice (ℂ²,U′,R₀) such that:

(1) U′=A∘U∘T where T∈SL₂(ℤ_(i)) and A∈U₂.

(2) λ′_(i)=U′(e_(i)),i=1,2 is an LLL reduced basis of Λ′=U′(ℤ_(i) ²).

In some embodiments, the algorithm constructs a sequence of upper triangular generator matrices:

U=U⁰,U¹, . . . U^(K)=U′,   (3.8)

such that U^(k+1)=A^(k)∘U^(k)∘T^(k) where T^(k)∈SL₂(ℤ_(i)) and A^(k)∈U₂. In what follows, we use the notation:

$U^{k} = {\begin{bmatrix} U_{11}^{k} & U_{12}^{k} \\ 0 & U_{22}^{k} \end{bmatrix}.}$

The sequence of generators in (3.8) is constructed by alternating between the following two types of transformations.

Size reduction transformation. Reduce the size of the second basis vector with respect to the first basis vector according to the rule λ₂ ^(k+1)=λ₂ ^(k)−aλ₁ ^(k) where:

$a = {\left\lbrack \frac{R\left( {\lambda_{1}^{k},\lambda_{2}^{k}} \right)}{R\left( {\lambda_{1}^{k},\lambda_{1}^{k}} \right)} \right\rbrack.}$

This transformation is realized by the unimodular matrix:

$T^{k} = {\begin{bmatrix} 1 & {- a} \\ 0 & 1 \end{bmatrix}.}$

Flipping transformation. The basis vectors are flipped according to the rule λ₁ ^(k+1)=−λ₂ ^(k) and λ₂ ^(k+1)=λ₁ ^(k). This change of basis is realized by the unimodular matrix:

$\begin{matrix} {T^{k} = {\begin{bmatrix} 0 & 1 \\ {- 1} & 0 \end{bmatrix}.}} & \; \end{matrix}$

The resulting matrix G^(k+1)=U^(k)∘T^(k) is lower triangular and is transformed back to upper triangular form by change of realization A^(k)=Q, where Q is the unitary factor in the QR decomposition of G^(k+1). Alternatively, U^(k+1)=U, the upper triangular factor in the Cholesky decomposition (G^(k+1))*G^(k+1)=U*U.
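The Cholesky alternative for returning to upper triangular form can be sketched in Python as follows. The function name `flip_upper_triangular_2d` is hypothetical, and the sketch assumes the standard Hermitian product on ℂ² and a non-singular input generator.

```python
import math


def flip_upper_triangular_2d(u11: complex, u12: complex, u22: complex):
    """Flip the basis of U = [[u11, u12], [0, u22]] and return the entries
    (n11, n12, n22) of the new upper triangular generator."""
    # Columns of G = U o T for T = [[0, 1], [-1, 0]]: (-lambda_2, lambda_1).
    g1 = (-u12, -u22)
    g2 = (u11, 0j)
    # Gram matrix of the flipped basis with respect to the standard product.
    g11 = abs(g1[0]) ** 2 + abs(g1[1]) ** 2
    g12 = g1[0].conjugate() * g2[0] + g1[1].conjugate() * g2[1]
    g22 = abs(g2[0]) ** 2 + abs(g2[1]) ** 2
    # Upper triangular Cholesky factor of [[g11, g12], [conj(g12), g22]].
    n11 = math.sqrt(g11)
    n12 = g12 / n11
    n22 = math.sqrt(max(g22 - abs(n12) ** 2, 0.0))
    return n11, n12, n22


# Illustrative values only.
n11, n12, n22 = flip_upper_triangular_2d(2 + 0j, 1j, 1 + 0j)
```

The product n11·n22 equals |u11·u22|, reflecting that the flip and the change of realization preserve the lattice volume.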

Convergence of the algorithm. The convergence of the reduction algorithm can be proved by energy considerations. We define the energy functional:

E(G)=R(λ₁,λ₁)²[R]₁(λ₂,λ₂).

Considering the sequence in (3.7), we have the following theorem.

Theorem. The following inequality always holds:

E(G ^(k+1))≤E(G ^(k)).

Moreover, when G^(k) is flipped we have E(G^(k+1))≤αE(G^(k)) for some α<1.
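One way to see why the energy behaves this way is sketched below, under the assumption that the Schur complement takes the usual form [R]₁(λ₂,λ₂)=R(λ₂,λ₂)−|R(λ₁,λ₂)|²/R(λ₁,λ₁). With this form the energy can be rewritten as:

$E(G) = R\left( \lambda_{1},\lambda_{1} \right)\left( R\left( \lambda_{1},\lambda_{1} \right)R\left( \lambda_{2},\lambda_{2} \right) - \left| R\left( \lambda_{1},\lambda_{2} \right) \right|^{2} \right).$

The second factor is the determinant of the Gram matrix of the basis, which is unchanged by any change of basis T^(k) with |det T^(k)|=1. A size reduction step leaves λ₁ fixed, so E is unchanged. A flip replaces R(λ₁,λ₁) by R(λ₂,λ₂), which gives:

$\frac{E\left( G^{k + 1} \right)}{E\left( G^{k} \right)} = \frac{R\left( \lambda_{2}^{k},\lambda_{2}^{k} \right)}{R\left( \lambda_{1}^{k},\lambda_{1}^{k} \right)} < 1,$

since a flip is applied only when the well ordering condition fails.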

3.2 Examples of Lattice Detection

Closest lattice point (CLP) problem. The CLP problem emerges as a relaxation of a hard finite quadratic minimization problem called the closest constellation point (CCP) problem. We consider the following general set up:

(1) Let (V,R) be an Hermitian vector space of dimension N.

(2) Let G:ℂ^(N)→V be the generator of a lattice Λ=G(ℤ_(i) ^(N)).

(3) Let {circumflex over (x)}∈V be a point in V called the soft estimation.

(4) Let Ω=(Λ+v₀)∩B where B is some bounded set and v₀∈V is some translation vector. The set Ω is called the set of constellation points.

The CCP problem is defined by:

$\begin{matrix} {{\hat{x}}^{h} = {\arg \mspace{14mu} {\min\limits_{p}\left\{ {{R\left( {\hat{x} - p} \right)}:{p \in \Omega}} \right\}}}} & (3.9) \end{matrix}$

Let {circumflex over (x)}_(t)={circumflex over (x)}−v₀. The CLP problem is the following relaxation of (3.9):

$\begin{matrix} {a^{*} = {\arg \mspace{14mu} {\min\limits_{a}\mspace{14mu} {\left\{ {{R\left( {{\hat{x}}_{t} - {G(a)}} \right)}:{a \in {\mathbb{Z}}_{i}^{N}}} \right\}.}}}} & (3.10) \end{matrix}$

Consequently, the derived hard estimation of {circumflex over (x)} is given by {circumflex over (x)}^(h)=G(a*)+v₀. Interestingly, the price in performance due to lattice relaxation is negligible.

Remark. Note that the standard QAM constellation sets can be viewed as a truncation of translated Gaussian lattices.
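As a small illustration of the remark, the following Python sketch builds a square QAM constellation as a truncated, translated Gaussian lattice and applies the lattice relaxation in dimension N=1. The choices G=2, v₀=1+i and the function names `qam_constellation` and `clp_hard_decision` are assumptions made for this example only.

```python
def qam_constellation(points_per_axis: int) -> set:
    """Square QAM as Omega = (Lambda + v0) ∩ B with Lambda = 2 * Z_i, v0 = 1 + 1j."""
    half = points_per_axis // 2
    coords = range(-half, half)  # Gaussian-integer coordinates kept by the box B
    return {complex(2 * ar + 1, 2 * ai + 1) for ar in coords for ai in coords}


def clp_hard_decision(x_hat: complex) -> complex:
    """Lattice relaxation: a* is the closest Gaussian integer to (x_hat - v0) / 2,
    and the hard estimate is G(a*) + v0."""
    v0 = 1 + 1j
    z = (x_hat - v0) / 2
    a_star = complex(round(z.real), round(z.imag))
    return 2 * a_star + v0


omega = qam_constellation(4)               # 16-QAM
decision = clp_hard_decision(0.8 - 2.7j)   # soft estimate chosen for illustration
```

Because the relaxation drops the bounding set B, a soft estimate far outside the constellation can map to a lattice point that is not in Ω; the sketch does not handle that case.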

Boosting CLP via lattice reduction. The main advantage in considering the lattice relaxation of the CCP problem is that lattices, in contrast to constellation sets, admit various bases and realizations which can be used to reduce the complexity of the CLP minimization problem. In particular, let (ℂ^(N),U,R₀) be an LLL reduced upper triangular realization of (V,G,R). Recall that

U=A∘G∘T,

where:

(1) U is upper triangular.

(2) T∈SL_(N)(ℤ_(i)).

(3) A:V→ℂ^(N) is an isomorphism satisfying A*(R₀)=R.

(4) λ_(i)=U(e_(i)),i=1, . . . N is an LLL reduced basis of the lattice Λ=U(ℤ_(i) ^(N)).

Note that once T is known (for example after applying the invariant LLL reduction algorithm) the matrix U can be obtained via the Cholesky decomposition:

(G∘T)*(R)=U ^(H) U.

Let {circumflex over (z)}=A({circumflex over (x)}_(t)). The CLP problem in the reduced realization is defined by:

$a^{*} = {\arg \; {\min\limits_{a}{\left\{ {{{\hat{z} - {U(a)}}}^{2}:{a \in {\mathbb{Z}}_{i}^{N}}} \right\}.}}}$

The hard decision {circumflex over (x)}^(h) is derived from the “reduced” minimum a* by the rule {circumflex over (x)}^(h)=G∘T(a*)+v₀. We refer to the CLP problem in the reduced basis as LR CLP.

Lattice detection problems. Herein, CLP relaxations of various detection problems originating from communication theory are described.

Lattice detection problem for a single tap channel. First consider the simpler case of a single tap channel model:

y=h(x)+w,   (3.11)

where:

(1) h:V→ℂ^(M) is the channel transformation.

(2) x∈Ω is the transmit vector assumed to belong to the constellation set.

(3) y∈ℂ^(M) is the received vector.

(4) w∈ℂ^(M) is the noise term assumed to be a Gaussian random vector w∈𝒩(0,R_(ww)) of mean zero and covariance matrix R_(ww).

In order to define the CLP problem we assume a Gaussian prior x∈𝒩(0,R_(xx)). Typically, R_(xx)=P·Id for some positive power P. Under these assumptions, the conditional probability of x given y takes the form:

ℙ(x|y)∝e^(−R({circumflex over (x)}−x,{circumflex over (x)}−x)),

where R=R_(ee) ⁻¹=(h*R_(ww) ⁻¹h+R_(xx) ⁻¹) and {circumflex over (x)}=Cy=R_(ee)h*R_(ww) ⁻¹y. In words, {circumflex over (x)} is the MMSE estimator of x in terms of y and R_(ee) is the covariance matrix of the residual error e=x−{circumflex over (x)}. The detection problem is defined by:

$\begin{matrix} {{\hat{x}}^{h} = {\arg \; {\max\limits_{x}\left\{ {{{\mathbb{P}}\left( x \middle| y \right)}:{x \in \Omega}} \right\}}}} \\ {= {\arg \; {\min\limits_{x}{\left\{ {{R\left( {\hat{x} - x} \right)}:{x \in \Omega}} \right\}.}}}} \end{matrix}$

Consequently, the CLP relaxation is defined by:

${a^{*} = {\arg \; {\min\limits_{a}\left\{ {{R\left( {{\hat{x}}_{t} - {G(a)}} \right)}:{a \in {\mathbb{Z}}_{i}^{N}}} \right\}}}},$

where {circumflex over (x)}_(t)={circumflex over (x)}−v₀.
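For intuition, the MMSE formulas above can be evaluated in the scalar case N=M=1, where h, R_(ww) and R_(xx) reduce to numbers. The following Python sketch is illustrative only; the function name `mmse_scalar` and the numerical values are assumptions.

```python
def mmse_scalar(h: complex, noise_var: float, signal_power: float, y: complex):
    """Scalar instance of R_ee = (h* R_ww^-1 h + R_xx^-1)^-1 and
    x_hat = R_ee h* R_ww^-1 y."""
    r_ee = 1.0 / (abs(h) ** 2 / noise_var + 1.0 / signal_power)
    x_hat = r_ee * h.conjugate() * y / noise_var
    return x_hat, r_ee


# Noise-free observation of x through the channel h, for illustration.
h, x = 1 + 1j, 1 + 1j
x_hat, r_ee = mmse_scalar(h, noise_var=0.5, signal_power=1.0, y=h * x)
```

In this example the soft estimate is a shrunken copy of x, as expected from the Gaussian prior; the Hermitian form used by the detection problem is then R=1/r_ee.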

Lattice detection problem for a multi-tap channel. Now consider the dispersive channel model:

y=h(x)+w,   (3.12)

where:

(1) h is the sequence of channel taps, h[k]∈Hom(V,ℂ^(M)),k∈ℤ.

(2) x is the transmit sequence, x[n]∈Ω,n∈ℤ.

(3) y is the received sequence, y[n]∈ℂ^(M),n∈ℤ.

(4) w is the noise sequence assumed to be uncorrelated in time with w[n]∈𝒩(0,R_(ww)).

We further assume that the channel is causal, that is, h[k]=0 for every k<0 and finite, that is, h[k]=0 for every k>v.

The term h(x) in (3.12) stands for the convolution: h(x)[n]=Σ_(k=0) ^(v)h[k](x[n−k]). In order to define the CLP problem we assume an uncorrelated Gaussian prior on x where x[n]∈𝒩(0,R_(xx)). Under these assumptions, we define {circumflex over (x)} to be the decision feedback Wiener estimate of x, given by:

{circumflex over (x)}=f(y)+b(x).   (3.13)

Herein,

(1) f=f[−μ], . . . f[0]∈Hom(ℂ^(M),V),μ∈ℕ^(≥1) is the Wiener forward filter.

(2) b=b[1], . . . b[v]∈Hom(V,V) is the strict Wiener feedback filter.

More explicitly, (3.13) is the sum of two convolution terms:

${\hat{x}\lbrack n\rbrack} = {{\sum\limits_{k = {- \mu}}^{0}\; {{f\lbrack k\rbrack}\left( {y\left\lbrack {n - k} \right\rbrack} \right)}} + {\sum\limits_{k = 1}^{\nu}{{b\lbrack k\rbrack}{\left( {x\left\lbrack {n - k} \right\rbrack} \right).}}}}$

It can be shown that the error sequence e[n]=x[n]−{circumflex over (x)}[n] is uncorrelated in time with e[n]∈𝒩(0,R_(ee)). Taking the Hermitian form R=R_(ee) ⁻¹, the detection problem at time n is defined by:

${{\hat{x}}^{h}\lbrack n\rbrack} = {\arg {\min\limits_{x}{\left\{ {{R\left( {{\hat{x}\lbrack n\rbrack} - {x\lbrack n\rbrack}} \right)}:{x \in \Omega}} \right\}.}}}$

Consequently, the CLP relaxation at time n is defined by:

${{a^{*}\lbrack n\rbrack} = {\arg {\min\limits_{a}\left\{ {{R\left( {{{\hat{x}}_{t}\lbrack n\rbrack} - {G(a)}} \right)}:{a \in {\mathbb{Z}}_{i}^{N}}} \right\}}}},$

where {circumflex over (x)}_(t)[n]={circumflex over (x)}[n]−v₀.

Computation of DF Wiener filters and the residual covariance matrix. The first step is to translate the channel model in (3.12) into a matrix form. To this end, define the column vectors:

Y=[y[0]^(T), . . . y[μ]^(T)]^(T)∈Mat(n_(y)M,1),

X=[x[−ν]^(T), . . . x[μ]^(T)]^(T)∈Mat(n_(x)N,1),

W=[w[0]^(T), . . . w[μ]^(T)]^(T)∈Mat(n_(y)M,1),

where n_(y)=μ+1,n_(x)=μ+v+1. We define the matrix:

$H = {\begin{bmatrix} {h\lbrack\nu\rbrack} & . & . & {h\lbrack 0\rbrack} & 0 & . & 0 \\ 0 & . & . & . & . & . & . \\ . & . & . & . & . & . & 0 \\ 0 & . & 0 & {h\lbrack\nu\rbrack} & . & . & {h\lbrack 0\rbrack} \end{bmatrix} \in {{Mat}\mspace{14mu} {\left( {{n_{y}M},{n_{x}N}} \right).}}}$

Under these conventions, the channel model translates to the matrix equation:

Y=HX+W.   (3.14)

Furthermore, define the Wiener inverse channel by:

X=CY+E,   (3.15)

where C∈Mat(n_(x)N,n_(y)M) is the Wiener estimator of X from Y and E is the residual error which, by definition, is uncorrelated with Y. We have the following formulas:

R_(EE) = (H^(H)R_(WW)⁻¹H + R_(XX)⁻¹)⁻¹, C = R_(EE)H^(H)R_(WW)⁻¹.

Consider R_(EE) as an n_(x)×n_(x) block matrix. Let R_(EE)=LDU be the block LDU decomposition where U is block upper triangular with I_(N) on every block of the diagonal, D is block diagonal and L=U^(H).

Apply L⁻¹ to both sides of the inverse channel equation (3.15) to get:

L ⁻¹ X=L ⁻¹ CY+L ⁻¹ E.   (3.16)

The equation (3.16) exhibits the property that the new noise term {tilde over (E)}=L⁻¹E is uncorrelated in time with R_({tilde over (E)}{tilde over (E)})=D. Write L⁻¹=I−B. Note that B is strictly block lower triangular. Substituting in (3.16), we get:

X=L ⁻¹ CY+BX+L ⁻¹ E.   (3.17)

Taking the 0th block coordinate of both sides of Equation (3.17), we get:

x[0]=e ₀ ^(T) L ⁻¹ CY+e ₀ ^(T) BX+e ₀ ^(T) L ⁻¹ E.

We have:

(1) f[k]=e₀ ^(T)L⁻¹C[−k], for k=−μ, . . . 0.

(2) b[k]=e₀ ^(T)B[−k], for k=1, . . . v.

(3) R_(ee)=D_(0,0).

Lattice detection problem for a multi-tap 2D channel. Let N_(f)∈ℕ^(≥1). Let ℤ/N_(f) denote the ring of integers modulo N_(f). An element l∈ℤ/N_(f) is called a frequency slot. We consider the dispersive 2D channel model:

y=h(x)+w,   (3.18)

where:

(1) h[k,l]∈Hom(V,ℂ^(M)),k∈ℤ,l∈ℤ/N_(f), called the 2D channel.

(2) x[n,l]∈Ω,n∈ℤ,l∈ℤ/N_(f), called the transmit sequence.

(3) y[n,l]∈ℂ^(M),n∈ℤ,l∈ℤ/N_(f), called the receive sequence.

(4) w[n,l]∈ℂ^(M),n∈ℤ,l∈ℤ/N_(f), called the noise sequence.

The noise sequence is assumed to be uncorrelated in time and in frequency with w[n,l]∈𝒩(0,R_(ww)). We further assume that the channel is causal and finite in time, that is, h[k,l]=0 for every k<0 and k>v.

The term h(x) in (3.18) stands for 2D convolution:

${{{h(x)}\left\lbrack {n,l} \right\rbrack} = {\sum\limits_{k = 0}^{\nu}{\sum\limits_{{l_{1} + l_{2}} = l}\; {{h\left\lbrack {k,l_{1}} \right\rbrack}\left( {x\left\lbrack {{n - k},l_{2}} \right\rbrack} \right)}}}},$

where the equation l₁+l₂=l under the second sum is taken modulo N_(f). Hence, the first convolution is linear and the second convolution is cyclic. We assume an uncorrelated Gaussian prior on x where x[n,l]∈𝒩(0,R_(xx)). Under these assumptions, we define {circumflex over (x)} to be the 2D decision feedback Wiener estimate of x, given by:

{circumflex over (x)}=f(y)+b(x),   (3.19)

where:

(1) f=f[−μ,l], . . . f[0,l]∈Hom(ℂ^(M),V),μ∈ℕ^(≥1) is the 2D Wiener forward filter.

(2) b=b[1,l], . . . b[v,l]∈Hom(V,V) is the strict 2D Wiener feedback filter.

More explicitly, (3.19) is the sum of two convolution terms:

${\hat{x}\left\lbrack {n,l} \right\rbrack} = {{\sum\limits_{k = {- \mu}}^{0}{\sum\limits_{{l_{2} + l_{2}} = l}{{f\left\lbrack {k,l_{1}} \right\rbrack}\left( {y\left\lbrack {{n - k},l_{2}} \right\rbrack} \right)}}} + {\sum\limits_{k = 1}^{\nu}{\sum\limits_{{l_{1} + l_{2}} = l}{{b\left\lbrack {k,l_{1}} \right\rbrack}{\left( {x\left\lbrack {{n - k},l_{2}} \right\rbrack} \right).}}}}}$

It can be shown that the error sequence e[n,l]=x[n,l]−{circumflex over (x)}[n,l] is uncorrelated in time and Toeplitz in frequency with e[n,−]∈𝒩(0,R_(ee)),

$R_{ee} = {\begin{bmatrix} r_{11} & r_{12} & . & . & r_{1N_{f}} \\ r_{12}^{H} & r_{11} & r_{12} & . & . \\ . & . & . & . & . \\ . & . & . & . & r_{12} \\ r_{1N_{f}}^{H} & . & . & r_{12}^{H} & r_{11} \end{bmatrix}.}$

In the definition of the CCP problem we ignore the correlations between different frequencies, by taking R=r₁₁ ⁻¹. The detection problem at time n and frequency l is defined by:

${{\hat{x}}^{h}\left\lbrack {n,l} \right\rbrack} = {\arg \; {\min\limits_{x}{\left\{ {{R\left( {{\hat{x}\left\lbrack {n,l} \right\rbrack} - {x\left\lbrack {n,l} \right\rbrack}} \right)}:{x \in \Omega}} \right\}.}}}$

Consequently, the CLP relaxation at time n and frequency l is defined by:

${{a^{*}\left\lbrack {n,l} \right\rbrack} = {\arg \; {\min\limits_{a}\left\{ {{R\left( {{{\hat{x}}_{t}\left\lbrack {n,l} \right\rbrack} - {G(a)}} \right)}:{a \in {\mathbb{Z}}_{i}^{N}}} \right\}}}},$

where {circumflex over (x)}_(t)[n,l]={circumflex over (x)}[n,l]−v₀.

Remark. Ignoring the cross correlations between different frequency slots in the definition of the CCP problem leads to a sub-optimal relaxation of the joint detection problem of x[n,−]∈Ω^(N_(f)). When these cross terms are small compared to the diagonal term r₁₁ we expect the loss in performance to be small.

Computation of 2D Wiener filters via Fourier domain representation. Herein, reducing the computation of 2D Wiener filters f[k,l] and b[k,l] and the residual covariance matrix to a 1D computation is described. To this end, the (normalized) DFT along the frequency dimension is applied to the 2D channel model in (3.18). This yields a channel model of the form:

Y=H(X)+W,   (3.20)

where:

(1) H[k,l]=DFT_(l)(h[k,l]),k∈ℤ,l∈ℤ/N_(f).

(2) X[n,l]=DFT_(l)(x[n,l]),n∈ℤ,l∈ℤ/N_(f).

(3) Y[n,l]=DFT_(l)(y[n,l]),n∈ℤ,l∈ℤ/N_(f).

(4) W[n,l]=DFT_(l)(w[n,l]),n∈ℤ,l∈ℤ/N_(f).

We refer to (3.20) as the Fourier domain representation of the 2D channel model (3.18).

Since the DFT is a unitary transformation, the Fourier noise term remains uncorrelated in time and frequency with W[n,l]∈𝒩(0,R_(ww)). Most importantly, the DFT converts cyclic convolution to multiplication, hence, the Fourier 2D channel model is composed of N_(f) non-interacting 1D channels, namely:

${{{H(X)}\left\lbrack {n,l} \right\rbrack} = {\sum\limits_{k = 0}^{\nu}{{H\left( {k,l} \right)}\left( {X\left\lbrack {{n - k},l} \right\rbrack} \right)}}},$

for every l=0, . . . N_(f)−1. Let {circumflex over (X)}=F(Y)+B(X) denote the DF Wiener estimate of X in terms of Y where F[k,l]∈Hom(

^(M),V) and B[k,l]∈Hom(V,V) are the forward and backward Wiener filters respectively. One can show that the residual error E=X−{circumflex over (X)} is uncorrelated in time and frequency with E[n,l]∈

(0,R_(EE)(l)). l=0, . . . N_(f)−1. We have:

${{f\left\lbrack {k,l} \right\rbrack} = {{IDFT}_{l}\left( {F\left\lbrack {k,l} \right\rbrack} \right)}},{{b\left\lbrack {k,l} \right\rbrack} = {{IDFT}_{l}\left( {B\left\lbrack {k,l} \right\rbrack} \right)}},{r_{11} = {\frac{1}{N_{f}}{\sum\limits_{l = 0}^{N_{f} - 1}{{R_{EE}(l)}.}}}}$

3.3 Examples of CLP Detection Algorithms

In this section, we describe various algorithms for finding the CLP. The algorithms fall into two classes: exact algorithms, which find the true minimum, and heuristic algorithms, which find an approximation of the minimum. The trade-off is complexity. All algorithms incorporate a fundamental weighted tree structure representation of the lattice where leaves of the tree are in 1-1 correspondence with vectors in the lattice. This localizes the search problem and enables branch and bound search strategies. The difference between the various algorithms is the strategy applied to search through the tree.

Set-up. We assume the following set-up:

(1) Let (V,G,R) be a based Hermitian lattice.

(2) Let Λ=G(ℤ_(i) ^(N)) be the associated lattice with basis λ_(i)=G(e_(i)),i=1, . . . N.

(3) Let {circumflex over (x)}∈V be the soft estimation.

The CLP problem is defined by:

$a^{*} = {\arg \; {\min\limits_{a}{\left\{ {{R\left( {\hat{x} - {G(a)}} \right)}:{a \in {\mathbb{Z}}_{i}^{N}}} \right\}.}}}$

In words: find the point in the lattice Λ that is the closest to {circumflex over (x)} with respect to the Euclidean metric R.

The lattice tree. There exists a canonical weighted tree structure 𝒯 associated with (V,G,R) and {circumflex over (x)}, called the lattice tree. Formally, the lattice tree is a triple 𝒯=(V,ℰ,w) where:

(1) V is the set of vertices.

(2) ℰ is the set of edges.

(3) w:ℰ→ℝ^(≥0) is a weight function on the edges.

Topology of the lattice tree. The topology of the lattice tree is defined in terms of the standard lattice ℤ_(i) ^(N). The set of vertices is a disjoint union of N+1 layers:

${V = {\underset{k = 0}{\overset{N}{\sqcup}}V^{k}}},$

where V^(k)={a∈ℤ_(i) ^(N):a[l]=0,l=1, . . . N−k} is the set of vertices at layer k. Consequently, the set of edges is a disjoint union of N layers:

${\mathcal{E} = {\underset{k = 0}{\overset{N - 1}{\sqcup}}\mathcal{E}^{k}}},$

where ℰ^(k)⊂V^(k)×V^(k+1) is the set of edges connecting layer k with layer k+1. An edge in ℰ^(k) is an ordered pair (a,b) such that b[l]=a[l] for every l≥N−k+1. Note that 𝒯 is a uniform infinite tree of depth N. The root of 𝒯 is the zero vector 0∈ℤ_(i) ^(N). The zero layer is V⁰={0}. All the branches are of length N and are in one to one correspondence with vectors in ℤ_(i) ^(N). Under this correspondence a vector a∈ℤ_(i) ^(N) is associated with the branch:

${a^{0}\overset{e^{0}}{\rightarrow}{a^{1}\overset{e^{1}}{\rightarrow}}}\mspace{11mu},\ldots \mspace{14mu},{\overset{e^{N - 1}}{\rightarrow}a^{N}},$

where a^(k)∈V^(k) and e^(k)=(a^(k),a^(k+1))∈ℰ^(k). The vertex a^(k) is defined by:

${a^{k}\lbrack l\rbrack} = \left\{ {\begin{matrix} {a\lbrack l\rbrack} & {l \geq {N - k + 1}} \\ 0 & {otherwise} \end{matrix}.} \right.$

Finally, we introduce the following notation. Given a vertex a∈V^(k), we denote by out(a)⊂ℰ^(k) the edges emanating from a and if k≥1, we denote by in(a)∈ℰ^(k−1) the unique edge that points to a.

The weight structure on the lattice tree. The weight function w is defined in terms of the based Hermitian lattice (V,G,R) and the soft estimation {circumflex over (x)}. Our approach is to first describe a vertex weight function μ:V→ℝ^(≥0) and then derive w as a localized version of μ. The precise relation between w and μ is:

${\mu \left( a^{k} \right)} = {\sum\limits_{l = 0}^{k - 1}{{\omega \left( e^{l} \right)},}}$

for every a^(k)∈V^(k) where:

${0 = {a^{0}\overset{e^{0}}{\rightarrow}{a^{1}\overset{e^{1}}{\rightarrow}}}}\mspace{14mu},\ldots \mspace{14mu},{\overset{e^{k - 1}}{\rightarrow}a^{k}},$

is the unique branch connecting the root with a^(k). Let us fix a vertex a^(k)∈V^(k).

Definition. The weight μ(a^(k)) is defined by:

$\begin{matrix} {{\mu \left( a^{k} \right)} = {\lbrack R\rbrack_{N - k}{\left( {\hat{x} - {G\left( a^{k} \right)}} \right).}}} & (3.21) \end{matrix}$

In words, μ(a^(k)) is the square distance between the vector {circumflex over (x)} and the vector G(a^(k)) measured mod λ₁, . . . λ_(N−k). In order to define w we introduce the notation:

{circumflex over (x)}_(k)=P_(V_(N−k))({circumflex over (x)}−G(a^(k))).   (3.22)

Let e^(k)=(a^(k),a^(k+1))∈ℰ^(k).

Definition. The weight w(e^(k)) is defined by:

$\begin{matrix} {{\omega \left( e^{k} \right)} = {\lbrack R\rbrack_{N - k - 1}{\left( {{\hat{x}}_{k} - {{a^{k + 1}\left\lbrack {N - k} \right\rbrack}\lambda_{N - k}}} \right).}}} & (3.23) \end{matrix}$

The relation between the vertex and edge weights is formalized in the following proposition.

Proposition. We have:

μ(a ^(k+1))=μ(a ^(k))+w(e ^(k)).

Explicit formulas for the weight function. In order to write the weight values in explicit form we assume that (V,G,R) is an upper triangular realization, that is:

(1) V=ℂ^(N).

(2) G=U is an upper triangular matrix.

(3) R=⟨−,−⟩ is the standard Hermitian product on ℂ^(N).

In this realization:

V_(k)={(x₁, . . . x_(N)):x_(i)=0,i≥k+1},

V_(k) ^(⊥)={(x₁, . . . x_(N)):x_(i)=0,i≤k},

for every k=0, . . . N. The orthogonal projections P_(V_(k)) and P_(V_(k) ^(⊥)) are given by x_(k)=P_(V_(k))(x) and x_(k) ^(⊥)=P_(V_(k) ^(⊥))(x) where:

${x_{k}\lbrack n\rbrack} = \left\{ {\begin{matrix} {x\lbrack n\rbrack} & {n \leq k} \\ 0 & {otherwise} \end{matrix},{{x_{k}^{\bot}\lbrack n\rbrack} = \left\{ {\begin{matrix} {x\lbrack n\rbrack} & {n \geq {k + 1}} \\ 0 & {otherwise} \end{matrix}.} \right.}} \right.$

Given a vertex a^(k)∈V^(k), the weight μ(a^(k)) is given by:

$\begin{matrix} \begin{matrix} {{\mu \left( a_{k} \right)} = {R_{V_{N - k}^{\bot}}\left( {\hat{x} - {U\left( a^{k} \right)}} \right)}} \\ {= {{P_{V_{N - k}^{\bot}}\left( {\hat{x} - {U\left( a^{k} \right)}} \right)}}^{2}} \\ {= {\sum\limits_{l = {N - k + 1}}^{N}{{{d(l)}}^{2}.}}} \end{matrix} & (3.24) \end{matrix}$

where d=x̂−U(a^(k)). Given an edge e^(k)=(a^(k),a^(k+1))∈ℰ^(k), the weight w(e^(k)) is given by:

$\begin{matrix} \begin{matrix} {{\omega \left( e^{k} \right)} = {\lbrack R\rbrack_{N - k - 1}\left( {{\hat{x}}_{k} - {{a^{k + 1}\left\lbrack {N - k} \right\rbrack}\lambda_{N - k}}} \right)}} \\ {= {{P_{V_{N - {({k + 1})}}^{\bot}}\left( {{\hat{x}}_{k} -^{k + 1}{\left\lbrack {N - k} \right\rbrack \lambda_{N - k}}} \right)}}^{2}} \\ {{= {{{{\hat{x}}_{k}\left\lbrack {N - k} \right\rbrack} - {{a^{k + 1}\left\lbrack {N - k} \right\rbrack}U_{{N - k},{N - k}}}}}^{2}},} \end{matrix} & (3.25) \end{matrix}$

where x̂_(k)=P_(V_(N−k))(x̂−G(a^(k))).

Remark. In some embodiments, working in an upper triangular realization is computationally beneficial because the weight formulas can be expressed explicitly in terms of the coefficients of the generator U by a simple recursive formula.
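The explicit formulas (3.24) and (3.25) can be checked numerically. The following is a minimal Python sketch under stated assumptions, not the implementation of any particular embodiment: lists are 0-based (so the 1-based coordinate N−k sits at position N−k−1), a vertex a^(k) is stored as a length-N list whose first N−k entries are zero, and the helper names (`mat_vec`, `vertex_weight`, `edge_weight`, `check_additivity`) as well as the values of U, x̂ and the branch are made up for illustration.

```python
def mat_vec(U, a):
    """Matrix-vector product U a for lists of (complex) numbers."""
    return [sum(U[i][j] * a[j] for j in range(len(a))) for i in range(len(U))]


def vertex_weight(U, x_hat, a, k):
    """mu(a^k) per (3.24): energy of the last k coordinates of d = x_hat - U a."""
    N = len(x_hat)
    Ua = mat_vec(U, a)
    return sum(abs(x_hat[l] - Ua[l]) ** 2 for l in range(N - k, N))


def edge_weight(U, x_hat, a, k, coeff):
    """w(e^k) per (3.25) for the edge that sets coordinate N-k (1-based) to coeff.

    Assumes a still holds a^k, i.e. its first N-k entries are zero.
    """
    N = len(x_hat)
    n = N - k - 1                            # 0-based position of coordinate N-k
    residual = x_hat[n] - mat_vec(U, a)[n]   # this is x_hat_k[N-k]
    return abs(residual - coeff * U[n][n]) ** 2


def check_additivity(U, x_hat, coeffs):
    """Walk one branch from the root and confirm mu(a^(k+1)) = mu(a^k) + w(e^k)."""
    N = len(x_hat)
    a = [0j] * N
    mu = vertex_weight(U, x_hat, a, 0)       # the root has weight 0
    for k in range(N):
        n = N - k - 1
        w = edge_weight(U, x_hat, a, k, coeffs[n])
        a[n] = coeffs[n]
        mu_next = vertex_weight(U, x_hat, a, k + 1)
        assert abs(mu_next - (mu + w)) < 1e-9
        mu = mu_next
    return mu


# Illustrative (made-up) upper triangular generator, soft estimate and branch.
U = [[2.0, 1 + 1j, 0.5],
     [0.0, 1.5, -1j],
     [0.0, 0.0, 1.0]]
x_hat = [0.3 - 0.2j, 1.1 + 0.4j, -0.9 + 2.2j]
branch = [1 + 0j, -1j, -1 + 2j]
leaf_weight = check_additivity(U, x_hat, branch)
```

Because U is upper triangular, setting coordinate N−k leaves the already accumulated terms of the sum in (3.24) unchanged, which is why the leaf weight equals the full squared distance between x̂ and U(a^(N)).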

Babai detector. The Babai detector is the simplest heuristic detector. The output of the Babai detector is the branch a*∈ℤ_(i)^(N):

$a^{*0}\overset{e^{*0}}{\rightarrow}a^{*1}\overset{e^{*1}}{\rightarrow}\cdots\overset{e^{*{({N - 1})}}}{\rightarrow}a^{*N},$

defined recursively as:

${e^{*k} = {\arg \; {\min\limits_{e}\left\{ {{\omega (e)}:{e \in {{out}\mspace{14mu} \left( a^{*k} \right)}}} \right\}}}},$

for every k=0, . . . N−1. In words, the Babai detector is a greedy algorithm that picks the branch that at every layer extends to the next layer via the edge of minimal weight.
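As an illustration of the greedy rule above, the following Python sketch assumes an upper triangular realization with Gaussian-integer coefficients, in which the minimal-weight outgoing edge at each layer is obtained by rounding the residual divided by the diagonal entry of U. The function names (`round_gaussian`, `babai_detect`) are hypothetical and the code is a sketch, not a definitive implementation.

```python
def round_gaussian(z):
    """Nearest Gaussian integer (integer real and imaginary parts) to z."""
    z = complex(z)
    return complex(round(z.real), round(z.imag))


def babai_detect(U, x_hat):
    """Greedy branch: at each layer take the outgoing edge of minimal weight.

    U is an N x N upper triangular matrix (list of rows) with nonzero diagonal,
    x_hat is the length-N soft estimate. Returns (a, mu) where a is the
    detected coefficient vector and mu its accumulated weight.
    """
    N = len(x_hat)
    a = [0j] * N
    mu = 0.0
    for n in range(N - 1, -1, -1):          # coordinates N, N-1, ..., 1 (1-based)
        # Cancel the contribution of the coefficients already decided.
        residual = x_hat[n] - sum(U[n][j] * a[j] for j in range(n + 1, N))
        # |residual - c*U[n][n]|^2 is minimized by the Gaussian integer
        # closest to residual / U[n][n].
        a[n] = round_gaussian(residual / U[n][n])
        mu += abs(residual - a[n] * U[n][n]) ** 2
    return a, mu
```

Each iteration evaluates (3.25) only implicitly: the candidate that minimizes the edge weight is found by rounding, so no enumeration of out(a*^(k)) is needed.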

Remark. In some embodiments, the Babai lattice detector is equivalent to decision feedback MMSE. However, the performance of the Babai detector after applying LLL basis reduction is significantly superior to the standard DF MMSE.

Sphere detector. The sphere detector is an exact detector. The sphere detector traverses the lattice tree according to a depth-first search strategy where the edges in out(a^(k)) are chosen in ascending order according to their weight. The algorithm incorporates a threshold parameter T, initialized at the beginning to T=∞. The algorithm follows two rules:

(1) When reaching a leaf a^(N)∈V^(N) update T←μ(a^(N)).

(2) Branch from a^(k)∈V^(k) to a^(k+1)∈V^(k+1) only if μ(a^(k+1))<T.
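The two rules can be sketched in Python as follows, again assuming an upper triangular realization with Gaussian-integer coefficients. One simplification is an assumption of this sketch and not part of the description above: at each layer only the candidates in a finite window of half-width `radius` around the rounded center are enumerated, so the search is exact only when the closest point lies inside those windows. The name `sphere_detect` is hypothetical.

```python
import math


def sphere_detect(U, x_hat, radius=2):
    """Depth-first search with threshold T over the lattice tree.

    Returns (a, T): the best coefficient vector found and its weight.
    """
    N = len(x_hat)
    best = {"T": math.inf, "a": None}       # T is initialized to infinity
    a = [0j] * N

    def descend(n, mu):
        if n < 0:                           # rule (1): leaf reached, update T
            best["T"], best["a"] = mu, list(a)
            return
        residual = x_hat[n] - sum(U[n][j] * a[j] for j in range(n + 1, N))
        center = complex(residual / U[n][n])
        cr, ci = round(center.real), round(center.imag)
        # Finite candidate window (an assumption of this sketch).
        candidates = [complex(cr + dr, ci + di)
                      for dr in range(-radius, radius + 1)
                      for di in range(-radius, radius + 1)]
        # Outgoing edges in ascending order of weight.
        scored = sorted(((abs(residual - c * U[n][n]) ** 2, c) for c in candidates),
                        key=lambda pair: pair[0])
        for w, c in scored:
            if mu + w >= best["T"]:         # rule (2): branch only if below T
                break                       # remaining edges are heavier still
            a[n] = c
            descend(n - 1, mu + w)

    descend(N - 1, 0.0)
    return best["a"], best["T"]
```

Since the edges are visited in ascending order, the first descent follows the Babai branch, and every later leaf that is reached strictly lowers T.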

FIG. 9 is a flowchart of a method 900 for wireless data reception. The method 900 includes, at 910, receiving a signal comprising information bits modulated using an orthogonal time frequency space (OTFS) modulation scheme, wherein each delay-Doppler bin in the signal is modulated using a quadrature amplitude modulation (QAM) mapping. The method 900 may lead to successful decoding and extraction of the information bits from the signal, using techniques described in the present document.

The method 900 includes, at 920, estimating the information bits by inverting and pre-processing a single error covariance matrix representative of estimation error for all delay-Doppler bins. In other words, estimating the information bits is based on an inverse of a single error covariance matrix of the signal.

As described in the present document, in some implementations, the method 900 may further include computing a unimodular matrix U, and improving numerical conditioning (or equivalently, decreasing the condition number) of the single error covariance matrix by multiplying with U. While the matrix U could be calculated using several methods, e.g., a brute force method, in some embodiments a lattice reduction algorithm (including the exemplary implementations described in Sections 2 and 3) may be used to reduce numerical complexity. In some embodiments, the lattice reduction algorithm may include a size reduction transformation followed by a flipping transformation. In an example, the size reduction transformation may be based on a first unimodular matrix and the flipping transformation may be based on a second (different) unimodular matrix.
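To illustrate how size reduction and flipping steps accumulate into a single unimodular matrix, the following Python sketch uses the two-dimensional, real-valued Lagrange-Gauss reduction as a simplified stand-in for the LLL algorithm; it is not the complex-valued algorithm of the embodiments. The names `dot` and `reduce_2d` and the example basis are hypothetical. Each size reduction corresponds to right-multiplication by the integer matrix [[1, −m], [0, 1]], and each flip to right-multiplication by [[0, 1], [1, 0]]; both have determinant of magnitude one.

```python
def dot(u, v):
    """Standard inner product of two real 2-D vectors."""
    return u[0] * v[0] + u[1] * v[1]


def reduce_2d(b1, b2):
    """Lagrange-Gauss reduction of the basis (b1, b2).

    Returns (b1, b2, T) where T is an integer matrix such that the j-th
    reduced vector equals sum_i T[i][j] * (i-th original vector).
    """
    T = [[1, 0], [0, 1]]
    while True:
        # Size reduction: subtract the nearest integer multiple of b1 from b2.
        m = round(dot(b1, b2) / dot(b1, b1))
        b2 = (b2[0] - m * b1[0], b2[1] - m * b1[1])
        T[0][1] -= m * T[0][0]
        T[1][1] -= m * T[1][0]
        if dot(b2, b2) >= dot(b1, b1):
            return b1, b2, T
        # Flip: exchange the two basis vectors (swap the columns of T).
        b1, b2 = b2, b1
        T[0][0], T[0][1] = T[0][1], T[0][0]
        T[1][0], T[1][1] = T[1][1], T[1][0]


# Made-up, badly conditioned basis of the integer lattice Z^2.
r1, r2, T = reduce_2d((4.0, 1.0), (7.0, 2.0))
det_T = T[0][0] * T[1][1] - T[0][1] * T[1][0]
```

For this example the reduced basis consists of two orthogonal unit-length vectors, so the reduced generator is perfectly conditioned, while T records the integer change of basis needed to map detected coefficients back to the original basis.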

The method 900 may also include lattice detection. Further, this may include converting the detected lattice point to the standard lattice (QAM lattice) and estimating the information bits by performing symbol-to-bit de-mapping. In some embodiments, the method 900 includes, for each delay-Doppler bin, performing a Babai detection (lattice detection) on the output of the lattice reduction algorithm. In some embodiments, the method 900 includes, for each delay-Doppler bin, performing a sphere detection (lattice detection) on the output of the lattice reduction algorithm. In some embodiments, and more generally, the method 900 includes, for each delay-Doppler bin, performing a closest lattice point (CLP) detection on the output of the lattice reduction algorithm. In some embodiments, the lattice reduction algorithm may be implemented using the LLL algorithm, a Block Korkine Zolotarev (BKZ) algorithm, a random sampling reduction (RSR) algorithm or a primal dual reduction (PDR) algorithm.
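One possible reading of the conversion step is sketched below in Python: if detection is carried out in a reduced basis G′=GT with T unimodular and yields integer coefficients z, then the same lattice point has coefficients a=Tz in the original basis, since G′z=G(Tz). This interpretation, the function names (`to_standard_coefficients`, `lattice_point`) and the small real-valued example are assumptions made for illustration; the subsequent symbol-to-bit de-mapping depends on the QAM mapping in use and is not shown.

```python
def to_standard_coefficients(T, z):
    """Return a = T z, the coefficients of the same point in the original basis."""
    return [sum(T[i][j] * z[j] for j in range(len(z))) for i in range(len(T))]


def lattice_point(basis, coeffs):
    """Return sum_j coeffs[j] * basis[j], where basis[j] is the j-th basis vector."""
    dim = len(basis[0])
    return [sum(c * b[i] for c, b in zip(coeffs, basis)) for i in range(dim)]


# Hypothetical example: original basis, unimodular T, and the reduced basis
# whose j-th vector is sum_i T[i][j] * original[i].
original = [(4, 1), (7, 2)]
T = [[-2, -7], [1, 4]]
reduced = [(-1, 0), (0, 1)]

z = [3, -1]                              # coefficients detected in the reduced basis
a = to_standard_coefficients(T, z)       # coefficients in the original basis
```

Because T has integer entries, a is again an integer vector, and both coefficient vectors describe the same lattice point.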

The method 900 may also include first determining that the inverse of the single error covariance matrix is numerically well-conditioned (or equivalently, having a condition number close to unity), and then performing a slicing operation on the QAM symbols in each of the delay-Doppler bins.

FIG. 10 is a block diagram of an example of a wireless communication apparatus 1000. The apparatus 1000 may include a processor 1002, a memory 1004 and transceiver circuitry 1006. The processor may implement various operations described in the present document that include, but are not limited to, the MIMO turbo equalizer described in Section 2 and the exemplary lattice reduction algorithm implementations described in Section 3. The memory may be used to store code and data that is used by the processor when implementing the operations. The transceiver circuitry may receive OTFS and other types of signals.

It will be appreciated that the present document discloses, among other features, techniques that allow for embodiments that provide near-maximum likelihood performance in OTFS systems. Ideally, a maximum likelihood (ML) algorithm may produce the theoretically best results. However, a Babai detector or a sphere decoder may be used by practical systems to achieve performance nearly as good as ML receivers. As discussed herein, lattice reduction may be used as a pre-processing step for the implementation of a Babai detector or a sphere decoder. In particular, for OTFS modulated signals, lattice reduction can be made computationally tractable (compared to OFDM signals) because only a single R_(ee) matrix has to be inverted. Furthermore, conditioning of the matrix also becomes a numerically easier task because only one matrix needs to be processed, rather than multiple matrices as in the case of OFDM systems.

The disclosed and other embodiments, modules and the functional operations described in this document can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this document and their structural equivalents, or in combinations of one or more of them. The disclosed and other embodiments can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a computer readable medium for execution by, or to control the operation of, data processing apparatus. The computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term “data processing apparatus” encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them. A propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus.

A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a standalone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.

The processes and logic flows described in this document can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit).

Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read only memory or a random access memory or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto optical disks, or optical disks. However, a computer need not have such devices. Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto optical disks; and CD ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.

While this patent document contains many specifics, these should not be construed as limitations on the scope of an invention that is claimed or of what may be claimed, but rather as descriptions of features specific to particular embodiments. Certain features that are described in this document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable sub-combination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a sub-combination or a variation of a sub-combination. Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results.

Only a few examples and implementations are disclosed. Variations, modifications, and enhancements to the described examples and implementations and other implementations can be made based on what is disclosed. 

1. A wireless communication method, implementable by a wireless communication receiver apparatus, comprising: receiving a signal comprising information bits modulated using an orthogonal time frequency space (OTFS) modulation scheme, wherein each delay-Doppler bin in the signal is modulated using a quadrature amplitude modulation (QAM) mapping; and estimating the information bits based on an inverse of a single error covariance matrix of the signal, wherein the single error covariance matrix is representative of an estimation error for all delay-Doppler bins in the signal.
 2. The method of claim 1, further including: computing a unimodular matrix comprising integer entries and having a unity determinant; and decreasing a condition number of the inverse of the single covariance matrix based on multiplication with the unimodular matrix.
 3. The method of claim 1, further including: performing, upon determining that the inverse of the single error covariance matrix is numerically well-conditioned, a slicing operation on QAM symbols in all the delay-Doppler bins.
 4. The method of claim 2, wherein the computing the unimodular matrix includes applying a lattice reduction algorithm to obtain the unimodular matrix.
 5. The method of claim 4, wherein the applying the lattice reduction algorithm includes: applying a Lenstra Lenstra Lovasz (LLL) lattice reduction algorithm.
 6. The method of claim 4, further including, for each delay-Doppler bin, performing Babai detection on an output of the lattice reduction algorithm.
 7. The method of claim 4, further including, for each delay-Doppler bin, performing sphere detection on an output of the lattice reduction algorithm.
 8. The method of claim 4, further including, for each delay-Doppler bin, performing a closest lattice point (CLP) detection on an output of the lattice reduction algorithm.
 9. The method of claim 4, wherein the lattice reduction algorithm includes a size reduction transformation followed by a flipping transformation.
 10. The method of claim 9, wherein the size reduction transformation is based on the unimodular matrix, and wherein the flipping transformation is based on another unimodular matrix different from the unimodular matrix.
 11. The method of claim 4, wherein the lattice reduction algorithm comprises a Lenstra Lenstra Lovasz (LLL) algorithm, a Block Korkine Zolotarev (BKZ) algorithm, a random sampling reduction (RSR) algorithm or a primal dual reduction (PDR) algorithm.
 12. The method of claim 1, further including, for each delay Doppler bin, converting an output of detected symbols to a standard lattice.
 13. The method of claim 1, wherein the estimating the information bits includes performing a symbol to bits de-mapping of the QAM symbols.
 14-15. (canceled)
 16. A wireless communication device comprising a processor and transceiver circuitry wherein the transceiver circuitry is configured for receiving a signal comprising information bits modulated using an orthogonal time frequency space (OTFS) modulation scheme, wherein each delay-Doppler bin in the signal is modulated using a quadrature amplitude modulation (QAM) mapping; and wherein the processor is configured for estimating the information bits based on an inverse of a single error covariance matrix of the signal, wherein the single error covariance matrix is representative of an estimation error for all delay-Doppler bins in the signal.
 17. The wireless communication device of claim 16, wherein the processor is further configured for: computing a unimodular matrix comprising integer entries and having a unity determinant; and decreasing a condition number of the inverse of the single covariance matrix based on multiplication with the unimodular matrix.
 18. The wireless communication device of claim 16, wherein the processor is further configured for: performing, upon determining that the inverse of the single error covariance matrix is numerically well-conditioned, a slicing operation on QAM symbols in all the delay-Doppler bins.
 19. The wireless communication device of claim 17, wherein the computing the unimodular matrix includes applying a lattice reduction algorithm to obtain the unimodular matrix.
 20. The wireless communication device of claim 19, wherein the applying the lattice reduction algorithm includes: applying a Lenstra Lenstra Lovasz (LLL) lattice reduction algorithm.
 21. The wireless communication device of claim 19, wherein the processor is further configured for, for each delay-Doppler bin, performing Babai detection on an output of the lattice reduction algorithm.
 22. The wireless communication device of claim 19, wherein the processor is further configured for, for each delay-Doppler bin, performing sphere detection on an output of the lattice reduction algorithm.